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IV 


INTRODUCTION 


All  biological  processes  at  the  cellular  level  are  the  consequence  of  a  series  of  chemical- 
physical  reactions  at  the  molecular  level  that  occur  within  the  micro-volume  of  the  cell.  The 
collection  of  molecular  species  and  the  reactions  among  them  is  referred  to  here  as  a 
biomolecular  reaction  network.  The  complete  biomolecular  reaction  network  for  a  cell  includes 
thousands  of  molecular  components  and  reactions  involved  in  transcription,  translation, 
molecular  self-assembly,  metabolic  reactions,  transport  and  physical  movements.  Since  these 
reactions  occur  in  an  extremely  small  reaction  volume,  the  number  of  molecules  of  any  one 
molecular  species  that  can  participate  in  a  given  reaction  can  range  from  single  copies  of  genes 
to  several  hundred  molecules  of  chemicals  at  the  pM  concentration  to  several  hundred  thousand 
molecules  of  chemicals  at  the  mM  concentration.  As  a  consequence  of  the  fact  that  a  subset  of 
all  the  reactions  in  the  system  involve  low  copy  numbers  of  substrate  molecules,  the  behavior  of 
individual  instances  of  the  system  cannot  be  modeled  accurately  using  continuous  deterministic 
(C-D)  approaches..  Thus,  these  natural  micro-systems  should  be  modeled  and  simulated  using 
basic  theory  of  discrete  stochastic  (D-S)  chemical  kinetics. 

With  the  evolution  of  systems  biology  in  recent  years,  interest  in  modeling  and 
simulating  the  behavior  of  engineered  genetic  circuits  in  bacterial  cells  has  increased.  In 
addition  to  living  cells,  nano-biotechnology  researchers  are  exploring  the  possibility  of 
developing  and  using  artificial  cellular  constructs  employing  natural  and  engineered  biological 
processes  (Ishikawa,  et  al.,  2004;  Noireaux  and  Libchaber,  2004;  Noireaux,  et  al.,  2005; 
Oberholzer,  et  al.,  1995;  Pohorille  and  Deamer,  2002;  Yu,  et  al.,  2001).  In  order  to  predict  the 
behavior  of  these  constructs,  modeling  and  simulation  of  their  biomolecular  reaction  networks 
are  needed  to  enable  the  design  and  fabrication  of  both  the  constructs  themselves  and  physical 
devices  based  on  these  constructs. 

In  the  past  ten  years,  several  software  packages  have  been  developed  and  released  to  the 
general  public  that  are  focused  on  simulation  and  analysis  tools  for  modeling  and  simulating 
biological  systems  (e.g.,  Adalsteinsson,  et  al.,  2004;  Dhar,  et  al.,  2004;  Ramsey,  et  al.,  2005; 
Takahashi,  et  al.,  2004).  Each  of  these  software  products  has  its  advantages  and  disadvantages 
for  different  modeling  needs.  We  developed  a  software  package  -  the  Biomolecular  Network 
Simulator  (BNS)  -  that  is  specifically  designed  to  operate  on  either  single  or  multiple  processor 
hardware.  The  software  allow  one  to  build  a  model  of  a  synthetic  biomolecular  reaction  network 
and  to  investigate  its  behavior  using  several  different  stochastic  algorithms.  In  this  paper,  we 
focus  on  the  application  of  the  Biomolecular  Network  Simulator  software  to  an  example  model 
to  illustrate  the  advantages  of  using  multiprocessor  computational  resources.  It  should  be 
recognized  that  many  of  the  features  of  BNS  can  be  found  in  other  simulation  software,  but,  to 
our  knowledge,  the  unique  combination  of  features  in  BNS  cannot  be  found  in  any  other 
software  currently  available. 


METHODS 


Stochastic  Simulation  Algorithm 

The  mathematical  description  of  the  behavior  of  stochastic  biomolecular  reaction 
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networks  is  based  on  Markov  process  theory  (Gillespie,  1992).  The  system  behaves  as  a  multi¬ 
variant,  discrete  state,  Markov  jump  process  and  is  governed  by  the  chemical  master  equation 
(CME).  The  solution  of  the  CME  is  in  fact  the  mathematically  exact  description  of  the  behavior 
of  the  system.  For  our  purposes,  we  will  consider  a  biomolecular  reaction  network  consisting  of 
Ns  identifiable  molecular  species,  denoted  5,  (i  =  1,2, ...,  Ns).  These  molecular  species  can 
undergo  Nr  fundamental  chemical  reactions  /y  (k  =1,2, Nr)  and  are  confined  to  a  fixed 
reaction  volume,  Vr.  It  is  assumed  that  the  system  is  well-mixed  (homogenous)  and  at  constant 
volume  and  temperature.  Let  s(t)  be  an  Ns  dimensional  state  vector  whose  elements  Si(t)  (i  =  1,  2, 
...,  Ns)  are  the  number  of  molecules  in  the  system  of  each  molecular  species  S)  at  time  t. 

The  stochastic  process  that  describes  the  behavior  of  the  biomolecular  reaction  network  is 
characterized  by  the  state  density  function  P(s,  t ) .  This  function  gives  the  probability  that  the 
system  is  in  state  s  at  time  t,  where  s  can  take  on  any  value  in  the  allowable  state  space.  P(s;t)  is 
the  solution  of  the  CME: 

dPisj)  =  ^  ]pk(s-vk)*  P(s-vk  ,t)-ak(s)P(s,t)_  (1) 

dt  t=i 

where  ak(s,t)  is  the  propensity  of  the  kth  fundamental  reaction  and  vk  is  the  state  change  vector,  a 
Ns  dimensional  vector  that  specifies  the  changes  in  the  number  of  molecules  of  each  state 
variable  when  the  kth  reaction  occurs.  Note,  the  sum  is  over  all  of  the  NR  possible  reactions  that 
can  occur.  The  specification  of  the  initial  condition  for  the  biomolecular  reaction  network  of 
interest,  P0(s)  =  P(s,t  =  0),  depends  on  the  precision  and  accuracy  of  the  measurement 

techniques  used  to  experimentally  characterize  the  system.  In  theory,  the  system  is  in  a  single 
well  defined  state  so  at  time  to,  where  the  number  of  molecules  of  each  molecular  species  is  equal 
to  the  exact  number  of  molecules  of  that  species  contained  in  the  reaction  volume  VR  at  time  to. 

In  this  case,  P0  (s)  is  defined  by  the  Kronecker  delta  function  as 

P0(s)  =  P(s.t  =  0)  =  S(s,s0)  (2) 

For  our  purposes,  it  will  be  assumed  that  the  initial  condition  as  defined  by  Equation  (2)  will 
hold  and  the  state  density  function  that  is  the  solution  of  the  CME  can  be  written  as  the 
conditional  probability  density  function  P(s, r|s0,  t0) . 

Usually,  an  analytical  solution  of  the  CME  is  not  possible  and  direct  numerical 
computation  of  the  solution  is  computationally  overwhelming  due  to  the  large  state  space. 
However,  the  direct  simulation  of  exact  (theoretically  possible)  trajectories  in  state  space  is 
feasible  (see  Appendix  A  for  additional  details).  The  time  evolution  of  the  state  vector  s(t)  for  a 
theoretically  possible  instance  of  the  system  can  be  calculated  using  various  algorithms  proposed 
for  Monte  Carlo  simulations  of  stochastic  trajectories.  The  Gillespie  direct  stochastic  algorithm, 
(Gillespie,  1977)  is  used  in  this  report  to  illustrate  the  stochastic  behavior  of  a  simple  gene 
expression  system.  The  Gillespie  direct  stochastic  algorithm  theoretically  generates  exact 
simulations  of  system  trajectories  in  state  space  if  and  only  if  all  reactions  in  the  biomolecular 
reaction  network  are  fundamental  reactions  (Gillespie,  1977).  In  the  limit  of  an  infinite  number 
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of  simulations,  the  statistical  properties  of  the  ensemble  of  exact  simulations  approaches  those  of 
the  exact  solution  of  the  CME,  i.e.,  for  the  first  moment  (mean)  of  s  we  have 


(3) 


where  (s(t))  is  the  estimate  of  the  mean  based  on  an  ensemble  of  n  simulations,  the  left  hand 


sum  is  over  all  possible  states  in  state  space  and  the  right-hand  sum  is  over  all  values  of  the  state 
vector  observed  in  the  n  simulation  runs.  In  addition,  the  variance  of  s  is 


(4) 


where  an  (t)  is  the  estimate  of  the  standard  deviation  based  on  the  ensemble  of  n  simulations. 

Although  the  basic  biochemical  reactions  in  a  biomolecular  reaction  network  are  discrete, 
jump  Markov  processes  and  thus  stochastic  in  nature,  if  the  number  of  molecules  in  the  system  is 
large  then  the  process  can  be  approximated  by  a  continuous  Markov  process  (Gillespie,  1992). 
Furthermore,  if  the  number  of  molecules  and  the  volume  increase  in  proportion  such  that  the 
concentration  of  each  species  is  constant  (the  so-called  thermodynamic  limit),  then  the  solution 
describing  the  behavior  of  the  state  variables  can  be  written  as  the  sum  of  a  sure  variable  that  is 
the  solution  of  the  classical  rate  equations  and  a  variable  factor  that  decreases  in  magnitude  as 

1  /  yjv^  .  Thus,  for  sufficiently  large  reaction  volume,  keeping  concentrations  constant 

(consequently  large  number  of  molecules),  the  first  moment  of  the  probability  density  function  of 
the  state  variables  approaches  the  classical  continuous  deterministic  solution  of  the  reaction  rate 
ODEs.  However,  if  there  are  only  a  few  molecules  of  any  given  species,  as  is  often  the  case  in 
gene  expression,  this  approximation  will  not  accurately  describe  the  instantaneous  state  of  the 
system.  Furthermore,  the  C-D  approach  will  provide  no  information  concerning  the  temporal 
fluctuations  of  state  variables  of  a  given  system  nor  the  variability  between  multiple 
instantiations  of  the  system  with  identical  initial  conditions. 

Biomolecular  Network  Simulator  Software 

The  Biomolecular  Network  Simulator  software  was  developed  to  allow  for  stochastic 
simulations  on  either  desktop  or  multi-processor  hardware  (see  Appendix  B  for  additional  details 
on  the  software  or  http://www.bioanalysis.org  for  complete  documentation).  The  front-end 
graphical  users  interface  (GUI)  and  the  backend  data  analysis  tools  are  written  in  MATLAB. 

This  allows  the  user  to  exploit  the  interactive  features  and  visualization  tools  of  MATLAB  for 
setting  up  simulations  and  analyzing  and  interpreting  the  resulting  data.  The  simulation  engine 
itself  is  written  in  the  C  language  to  maximize  speed  for  the  computationally  intensive  part  of  a 
simulation  run. 

The  BNS  software  accepts  two  types  of  model  definitions:  (1)  Systems  Biology  Markup 
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Language  (SBML)  format  (Huska,  et  al.,  2003)  and  (2)  BNS  format  where  models  are  defined  by 
a  set  of  MATLAB  m-files.  There  are  two  types  of  output  files:  snapshot  data  and  event  log  data. 
Snapshot  data  files  contain  the  state  of  the  system  (number  of  molecules  of  each  molecular 
species)  and  the  number  of  reaction  occurrences  in  each  reaction  channel  since  the  last  snapshot 
at  user  specified  time  intervals.  The  second  type  of  output  files  -  the  event  log  files  -  contain  the 
record  of  every  discrete  event  that  occurs  during  the  simulation. 

Parallelization  of  the  BNS  code  for  simulations  runs  on  high  performance  computing 
hardware  is  accomplished  using  the  Message  Passing  Interface  (MPI).  MPI  consists  of  a  set  of 
MATLAB  scripts  that  implements  a  subset  of  the  Message  Passing  Interface  standard  and  allows 
MATLAB  programs  to  run  on  multiprocessor  architectures.  In  our  parallelization  scheme,  the 
‘master’  processor  divides  the  total  number  of  simulation  runs  into  a  set  of  jobs  depending  on  the 
number  of  available  processors  and  sends  a  job  to  each  of  the  ‘worker’  processors.  The 
snapshot  data  from  the  workers  are  sent  back  to  the  master  processor  for  the  interactive  graphics 
but  the  event  log  files  are  saved  to  the  hard  drive  by  the  workers.  In  this  approach  to 
parallelization,  the  power  of  multiple  processors  is  utilized  to  run  a  large  number  of  simulations 
simultaneously  and  thus  speedup  the  overall  clock  time  for  the  batch  job. 

BNS  allows  the  user  to  select  the  appropriate  ‘Model’  and  ‘Parameters’  directories  and 
set  the  ‘Run’  mode  for  each  simulation  session.  If  simulations  are  run  in  the  interactive  mode, 
the  current  results  of  the  simulation  appear  on  the  monitor  at  specified  plotting  intervals  during  a 
simulation  run.  Usually,  HPC  centers  allocate  limited  resources  (in  terms  of  the  number  of 
processors  and  running  time)  for  interactive  simulations,  therefore  BNS  can  be  run  in  ‘Batch’ 
mode.  In  this  mode  all  output  data  are  stored  directly  on  the  hard  drive  for  post  hoc  analysis. 

The  BNS  software  has  a  comprehensive  set  of  tools  for  post-simulation  analyses.  The 
most  frequently  used  type  of  analysis  is  to  plot  the  number  of  molecules  of  a  particular  molecular 
species  versus  time.  The  number  of  molecules  versus  time  plots  can  be  created  with  both  types 
of  output  files:  snapshot  data  or  event  log  data  with  the  event  log  data  giving  an  exact  description 
of  the  behavior  of  the  selected  state  variable.  A  time-weighted  average  analysis  provides  for  the 
calculations  of  the  average  number  of  molecules  of  a  particular  molecular  species  during  a  user 
selected  time-interval.  The  average  is  weighted  according  to  the  amount  of  time  the  compound 
exists  in  each  state  during  the  selected  time-interval.  The  averaging  analysis  can  be  performed 
for  a  single  simulation  run  or  for  an  ensemble  of  runs.  In  the  latter  case,  the  between  run  average 
(the  average  of  the  individual  time-weighted  average  over  the  ensemble  of  simulation  runs)  and 
standard  deviation  are  plotted. 

Complex  biomolecular  reaction  networks  that  involve  gene  expression  are  usually  stiff 
systems,  i.e.,  contain  reactions  that  occur  on  different  time  scales;  some  reactions  have  a  low 
propensity  and  occur  rarely  while  other  reactions  have  a  high  propensity  and  occur  frequently.  A 
unique  feature  of  the  BNS  software  is  that  the  data  stored  allows  the  user  to  perform  various 
event  rate  analyses  on  the  simulation  data  to  learn  more  about  the  basic  nature  of  the  system. 
Event  rates  (number  of  reaction  events  per  unit  time)  in  each  reaction  channel  can  be  calculated 
for  user-selected  time-averaging  intervals  and  plotted  versus  time.  These  analyses  provide 
important  information  about  the  behavior  of  the  system,  e.g.,  relative  event  rates  for  important 
reactions.  Furthermore,  the  event  rate  data  can  be  used  to  calculate  the  rate  of  energy  utilization 


4 


in  selected  reaction  channels. 


Exemplar  model 

In  order  to  investigate  the  simulation  of  a  biomolecular  reaction  network  with  BNS,  a 
simple  model  of  a  generic  self-assembling  catalytic  ligation  reaction  in  a  cell-free  bacterial 
transcription-translation  (CFTT)  system  is  explored.  The  biomolecular  reaction  network  consists 
of  the  transcription  and  translation  of  a  single  gene  (geneA)  to  form  an  active  catalytic  enzyme 
(Pro_A)  using  a  commercial  gene  expression  system  in  an  artificial  vesicle.  The  system  is 
assumed  to  be  contained  in  a  spherical  liposome  the  size  of  a  bacterial  cell  (reaction  volume  =  5 
x  10'16  L).  The  catalytic  enzyme  is  transcribed  from  a  plasmid  vector  and  the  expressed  protein 
catalyzes  the  ligation  of  substrates  Sub_A  and  Sub_B  to  form  the  product  Prod  A.  The  CFTT 
system  contains  all  of  the  necessary  bacterial  components  for  transcription  of  a  target  gene  from 
a  plasmid  containing  the  T7  bacteriophage  RNA  polymerase  promoter.  In  addition,  the  system 
contains  all  the  necessary  ingredients  for  successful  translation  of  the  mRNA  generated  by  the 
T7-polymerase  into  the  expressed  protein. 

To  formulate  the  simplest,  yet  biochemically  reasonable,  model  of  the  kinetics  of  the  self- 
assembly  of  the  examplar  biomolecular  reaction  network,  the  conceptual  system  model 
illustrated  in  Figure  1  was  proposed.  This  system  consists  of  45  state  variables  and  12  reactions 
(see  Supplementary  Material  for  a  more  complete  description  of  the  model).  Transcription 
consists  of  three  reactions  ( rl  -  r3)  that  include  association  and  dissociation  of  the  T7- 
polymerase  (T7_RNAp)  and  the  T7-promoter  site  for  geneA  (T7_P)  to  form  the  promoter- 
polymerase  complex  (T7_RNAp_T7_P)  and  the  subsequent  formation  of  the  mRNA 
(geneAmRNA).  The  mRNA  can  either  be  degraded  by  a  generic  RNase  (r4)  or  used  as  a 
template  for  protein  synthesis.  Translation  also  consists  of  three  reactions  (r5  -  r7)  that  include 
association  and  dissociation  of  the  small  ribosomal  unit  (Rib_s)  and  the  ribosomal  binding  site 
on  the  geneA_mRNA  to  form  the  ribosomal-mRNA  complex  (Rib_s_geneA_mRNA)  and  the 
subsequent  formation  of  the  protein  product  (Pro_A).  The  protein  product  (Pro_A)  is  capable  of 
catalyzing  the  ligation  of  Sub_A  and  Sub_B  to  form  the  metabolic  product  Prod_A  via  reaction 
r8.  All  proteins  can  be  competitively  degraded  by  a  generic  protease  (Prot),  reactions  r9  -  rl2. 

Since  gene  expression  reactions  involve  a  single  plasmid  contained  in  the  micro-volume 
of  the  vesicle,  the  transcription  and  translation  reactions  are  stochastic  in  nature.  As  discussed 
above,  the  most  accurate  way  to  model  the  biomolecular  reaction  system  is  to  use  a  stochastic 
approach  to  solve  the  CME,  with  the  number  of  molecules  of  each  molecular  species  present  in 
the  micro-volume  as  variables.  However,  the  CME  for  this  system  cannot  be  solved  explicitly. 
Here  we  use  the  Gillespie  direct  stochastic  simulation  algorithm  to  demonstrate  the  advantages 
of  using  the  BNS  software  to  obtain  sufficient  numbers  of  probabilistically  correct  trajectories 
consistent  with  the  CME  through  the  use  of  Monte  Carlo  simulations. 
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Figure  1 :  Schematic  diagram  of  a  single  gene  biomolecular  reaction  network. 
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RESULTS 


Simulation  of  exemplar  model  using  the  Gillespie  Direct  Algorithm 

In  order  to  investigate  the  general  behavior  of  the  exemplar  model,  a  series  of  simulations 
were  run  using  the  following  conditions:  (1)  the  Gillespie  direct  stochastic  simulation 
algorithm,  (2)  an  SBML  model  definition,  (3)  the  stochastic  reaction  parameters  and  initial 
conditions  in  Tables  C.2  and  C.3,  respectively,  in  Appendix  C,  and  (4)  the  following  simulation 
parameters:  duration  of  simulation  =  3600  sec,  snapshot  interval  =  10  sec  (giving  a  total  of  360 
snapshots),  and  number  of  simulations  =  10.  Due  to  the  scale  of  the  model  (45  state  variables),  it 
is  not  possible  to  show  the  total  set  of  data  for  all  state  variables,  but  a  few  selected  and 
important  state  variables  are  shown  in  Figure  2  (remember,  these  are  simulation  data  for  a 
generic  model  and  do  not  necessarily  represent  the  behavior  of  actual  state  variables  and/or 
reaction  rates).  The  data  presented  show  the  trajectory  for  a  single  simulation  and  the  estimated 
mean  (first  moment)  and  standard  deviation  of  the  state  density  function  P(s,t  /so,to)  for  each 
selected  state  variable.  Since  the  biomolecular  reaction  system  under  investigation  is  a  closed 
system,  when  critical  substrates  are  depleted,  the  affected  reactions  stop.  In  this  particular 
system,  three  substrates  prove  to  be  critical:  (1)  AA_Q  (glutamine)  is  depleted  at  about  1400 
sec,  (2)  GTP  is  depleted  at  about  2500  sec,  and  (3)  Sub_A  at  about  3000  sec.  Thus,  even 
though  there  is  adequate  geneAmRNA  present,  protein  synthesis  terminates  at  about  1400  sec 
when  the  limiting  amino  acid,  AA_Q,  is  depleted.  Messenger  RNA  synthesis  terminates  at 
approximately  2500  sec  when  one  of  the  nucleotides,  GTP,  is  depleted.  Note,  GTP  is  utilized  by 
both  mRNA  synthesis  and  protein  synthesis,  thus  if  protein  synthesis  had  not  terminated  at  1400 
sec  due  to  depletion  of  one  of  the  amino  acids,  it  would  have  terminated  at  2500  sec  due  to  the 
depletion  of  GTP.  Finally,  formation  of  the  metabolic  product  Prod  A  terminates  when  one  of 
its  substrates,  Sub_A,  is  depleted  at  3000  sec. 

Each  simulation  run  provides  a  probabilistically  accurate  trajectory  of  the  system  in  state 
space.  However,  the  likelihood  that  any  actual  system  would  follow  the  simulated  trajectory  is 
small.  Thus,  comparison  of  an  single  simulation  run  with  time-series  experimental  data  from  a 
single  vesicle  is  not  particularly  useful,  except  in  the  general  sense  of  trends.  The  value  of 
individual  simulation  runs  is  to  provide  some  intuitive  insight  into  the  possible  behavior  of  the 
system  under  investigation.  For  example,  Figure  3  shows  the  state  space  trajectories  for  protein 
Pro_A  as  generated  by  10  individual  simulations.  In  each  case,  the  ultimate  level  of  protein 
Pro_A  is  108  molecules  in  the  vesicle  (this  is  determined  by  the  limiting  amino  acid  AA_Q). 
However,  the  time  when  protein  synthesis  is  completed  varies  over  a  significant  range, 
approximately  300  sec,  from  1 100  to  1400  sec.  As  a  consequence  of  this  stochastic  variability, 
when  real-time  experimental  data  from  individual  vesicles  are  obtained,  the  only  meaningful 
comparison  is  between  the  experimental  data  and  the  simulation  ensemble  mean  ±  the  standard 
deviation  (right-hand  panels  in  Figure  2).  Two  thirds  of  the  time,  the  experimental  data  should 
fall  within  the  envelop  of  the  mean  ±  the  standard  deviation.  However,  significant  excursions 
from  the  envelop  can  occur  even  when  the  model  is  a  correct  representation  of  the  experimental 
system.  A  better  comparison  between  single  vesicle  experimental  data  and  model  simulations  is 
between  the  experimental  mean  ±  the  standard  deviation  obtained  from  multiple  (many)  single 
vesicle  observations  versus  the  mean  ±  the  standard  deviation  of  an  ensemble  of  a  large  number 
of  simulations  runs  (see  discussion  below  on  the  effect  of  the  number  of  simulation  runs  on 
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estimates  of  the  mean  and  standard  deviations  of  the  probability  density  function  for  the  system). 
If  experimental  data  is  only  obtained  as  the  mean  of  a  large  sample  of  vesicles,  i.e.,  a  grab 
sample  consisting  of  many  vesicles,  then  the  only  meaningful  comparison  is  between  the  macro¬ 
sample  mean  and  the  mean  of  a  large  number  of  simulations  at  corresponding  time  points.  In 
this  case,  no  data  concerning  the  variability  between  individual  vesicles  can  be  obtained.  Note, 
the  standard  deviation  obtained  from  multiple  macro-mean  experiments  still  would  not 
correspond  to  the  fluctuations  exhibited  in  model  simulations,  but  rather  would  be  the  result  of 
experimental  uncertainties  (e.g.,  experimental  measurement  errors  and  non-identical  systems), 
which  are  not  simulated.  In  fact,  if  there  were  no  experimental  error,  then  the  macro-means  of 
multiple  experiments  on  identical  systems  would  be  identical. 
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Figure  2:  Selected  results  for  simulations  of  the  exemplar  model.  The  left-hand  panel  is  a  plot 
of  the  number  of  molecules  of  the  selected  state  variable  versus  time  for  a  single  simulation  run. 
These  plots  were  obtained  from  the  event  log  data  and  include  every  event  that  influenced  the 
particular  state  variable.  The  right-hand  panel  is  an  approximation  to  the  state  density  function 
obtained  by  averaging  the  number  of  molecules  over  10  simulation  runs  at  selected  time  intervals 
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Binned  Compound  Numbers  vs.  Time.  Evaluated  for  runs  1  to  10 
Bin  Size  =  10.  Time  range  =  0  to  3600.  Source  data  file  =  gillespie_100_data 
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Figure  3:  Simulation  data  for  possible  trajectories  in  state  space  for  the  number  of  molecules  of 
protein  Pro_A.  Ten  individual  simulations  were  run  and  the  number  of  molecules  of  Pro_A 
versus  time  are  plotted  for  each  simulation.  Event  log  data  were  used  for  these  plots,  therefore 
every  translation  event  that  produced  a  molecule  of  Pro_A  is  shown  for  each  trajectory. 


Number  of  Molecules  vs.  Time,  showing  every  event,  for  runs  1  through  10 
source  data  file  =  gillespie_100_parsed 


To  further  investigate  the  behavior  of  the  system,  the  event  rates  of  selected  reactions 
were  investigated.  As  a  consequence  of  the  system  behaving  as  a  discrete  jump  Markov  process, 
each  event  occurs  instantaneously  and  the  value  of  associated  state  variables  change 
discontinuously  at  the  time  of  the  event.  As  a  consequence,  there  is  no  derivative  of  the  state 
variables  that  would  correspond  to  the  C-D  concept  of  rate  of  change.  Hence,  for  these 
processes,  the  'reaction  rate'  is  defined  as  the  number  of  events  counted  during  a  time-averaging 
interval  (TAI)  divided  by  that  time  interval,  giving  an  estimate  of  the  event  rate  (number  of 
events  per  unit  time).  These  estimates  will  depend  on  the  TAI  as  illustrated  in  Figure  4.  A  small 
time-averaging  interval  results  in  counting  individual  events  and  dividing  by  a  small  time 
interval  giving  large  fluctuations  within  a  individual  simulation  run  and  between  multiple 
simulation  runs  depending  on  whether  a  particular  time  interval  contains  an  event  or  not.  This  is 
obvious  in  the  TAI  =  1  sec  panel  where  the  between  run  variability  is  large.  On  the  other  hand,  a 
large  time-averaging  interval  will  reduce  the  variability  thus  smoothing  the  data,  but  will  affect 
the  time  resolution  of  dynamical  changes  in  rates  due  to  the  averaging  over  longer  intervals.  For 
the  results  below,  a  time-averaging  interval  of  10  sec  was  selected  to  maximize  time  resolution 
of  system  dynamics  without  significant  artifacts  due  to  too  small  a  time-averaging  interval. 
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Figure  4:  Effect  of  time-averaging  interval  (TAI)  on  estimated  reaction  event  rates.  Estimated 
event  rate  data  was  calculated  using  various  TAIs  from  1  to  600  sec  and  averaged  over  10 
simulations  for  selected  reactions.  The  mean  is  the  average  of  the  estimated  event  rate  for  all  10 
simulation  runs  at  the  given  time  interval  and  the  standard  deviation  reflects  the  variability 
between  runs.  Note  the  difference  in  scale  between  the  TAI  =  1  sec  panel  and  the  other  panels. 
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The  reaction  event  rate  was  computed  for  selected  reactions  using  a  user  defined  time  averaging 
interval  of  10  sec  as  discussed  above  and  the  results  are  shown  in  Figure  5.  In  Figure  5(A)  the 
total  number  of  reaction  events  in  each  reaction  channel  is  shown,  averaged  over  the  10 
simulation  runs.  In  this  examplar  model,  reactions  r5,  r6  and  r8  dominate  the  behavior  of  the 
system.  Reactions  r5  and  r6  are  the  association  and  dissociation  of  the  small  ribosomal  unit 
Ribos  and  the  ribosomal  binding  site  on  gene_A  messenger  RNA,  geneA_mRNA,  and  reaction 
r8  is  the  catalytic  ligation  reaction.  In  figures  5(B)  through  5(F),  both  the  time-averaged  event 
rate  for  a  single  simulation  run  (left-hand  panel)  and  the  mean  ±  one  standard  deviation  for  the 
ensemble  of  10  simulations  (right-hand  panel)  are  shown.  The  reaction  event  rates  vary  during 
the  simulation  depending  on  the  availability  of  substrates  (and  enzymes  where  required)  and 
range  from  0  -  0.3  reactions  per  sec  for  reaction  r3  (transcription)  to  0  -  18  reactions  per  sec  for 
reaction  r8  (catalytic  ligation).  Thus,  the  fastest  reaction  is  about  100  times  faster  than  the 
slowest  reaction.  A  unique  feature  of  stochastic  systems  is  that  the  timing  of  specific  events 
varies  from  one  instance  to  the  next.  An  example  of  this  effect  is  seen  in  Figure  6,  where  the 
reaction  event  rate  for  reaction  r3  (transcription)  is  shown  for  each  of  the  10  simulation  runs. 
These  plots  were  obtained  from  the  snapshot  data  with  a  time-averaging  interval  of  10  sec. 
Above  each  plot  the  time  of  the  last  transcription  event  is  displayed.  The  transcription  reaction 
terminates  when  the  available  GTP  is  depleted  and  ranges  from  2180  to  2551  sec  with  a  mean 
and  standard  deviation  of  2337  ±  136  sec.  Thus,  the  timing  of  any  specific  event  in  a  stochastic 
process  will  always  appear  as  a  distribution  rather  than  a  fixed  time  as  would  be  the  case  for  a  C- 
D  process.  This  effect  will  be  addressed  further  in  the  discussion  of  the  C-D  approximation 
below. 
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Figure  5:  Time-averaged  event  rates  of  selected  reactions.  (A)  Total  number  of  reactions  in 
each  reaction  channel  during  simulation  (left  hand  panel  is  plotted  with  a  linear  scale,  the  right 
hand  panel  uses  a  log  scale).  (B)  -  (F)  Time-averaged  event  rates  for  selected  reactions  - 
number  of  events  per  sec  averaged  over  10  sec  intervals.  Left  hand  panel  shows  the  averaged 
rate  for  a  single  simulation  run.  The  right  hand  panel  is  the  mean  ±  SD  for  the  ensemble  of  10 
simulation  runs. 
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(D)  Reaction  r5  -  Association  of  Rib  s  with  geneA_mRNA  to  form  the  Rib_s_geneA_MRNA 


Binned  Reaction  occurences  vs.  time.  Evaluated  for  runs  1  to  1 
Bin  Size  =  10.  Time  range  =  0  to  3600.  Source  data  file  =  gillespie_100_data 


Binned  Reaction  occurences  vs.  time.  Evaluated  for  runs  1  to  10 
Bin  Size  =  10.  Time  range  =  0  to  3600.  Source  data  file  =  gillespie_100_data 
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Binned  Reaction  occurences  vs.  time.  Evaluated  for  runs  1  to  10 
Bin  Size  =  10.  Time  range  =  0  to  3600.  Source  data  file  =  gillespie_100_data 
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(F)  Reaction  r8  -  Catalytic  Ligation 


Binned  Reaction  occurences  vs.  time.  Evaluated  for  runs  1  to  1  Binned  Reaction  occurences  vs.  time.  Evaluated  for  runs  1  to  10 

Bin  Size  =  10.  Time  range  =  0  to  3600.  Source  data  file  =  gillespie_100_data  Bin  size  =  10  Tjme  range  =  „  t0  36oo.  Source  data  fHe  =  gi||espie_1 00_data 


Time  Time 


Figure  6:  Individual  reaction  event  rate  plots  for  reaction  r3  (transcription)  for  10  simulation 
runs.  Reaction  event  rates  were  calculated  with  a  TAI  of  10  sec.  The  time  of  the  last  reaction 
event  is  displayed  above  each  plot. 


t,ast  =  2268  tlast  =  2456 
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Comparison  between  single  and  multi-processor  simulation  runs 

Running  a  simulation  session  as  a  batch  job  on  multi-processor  HPC  hardware  entails  a 
certain  amount  of  overhead,  e.g.,  the  time  it  takes  to  breakup  the  job  into  smaller  tasks  and  assign 
the  problem  to  each  processor  on  the  front  end  and  the  collection  and  data  storage  on  the  back 
end.  As  a  result,  the  speed-up  gained  by  using  multi-processor  hardware  is  to  a  degree 
dependent  on  how  computationally  intensive  the  problem  is.  For  a  relatively  simple  problem 
that  is  not  particularly  computationally  intensive,  the  majority  of  the  clock  time  for  the 
simulation  session  is  taken  up  with  overhead.  Whereas,  for  a  problem  that  is  computationally 
intensive,  the  computations  involved  in  the  actual  simulation  are  the  time  consuming  component 
of  the  simulation  process.  To  test  this  effect,  we  ran  a  batch  job  with  the  exemplar  model  using 
multi-processor  HPC  hardware  to  evaluate  the  speed-up  in  clock  time  with  increasing  numbers 
of  processors.  Specifically,  we  executed  10000  simulation  runs  of  the  exemplar  model  as  a  batch 
job  on  an  HP  XC  machine  with  distributed  memory  architecture  using  the  Gillespie  direct 
stochastic  simulation  algorithm  and  various  numbers  of  processors  (Figure  7).  Speed-up  was 
calculated  as  the  clock  time  it  took  to  run  the  batch  job  on  a  single  processor  divided  by  the  clock 
time  for  the  same  batch  job  using  multiple  processors.  As  a  consequence  of  the  manner  in  which 
parallelization  using  multiple  processors  was  implemented  (parallel  simulations  on  multiple 
processors),  full  utilization  of  the  BNS  software  should  result  in  a  speed-up  proportional  to  the 
number  of  processors  used.  Up  to  10  processors,  the  speed-up  was  approximately  linear  with  the 
number  of  processors  for  this  computationally  simple  model.  However,  the  speedup  observed  by 
running  the  model  using  20  and  50  processors  in  the  batch  mode  was  only  15.6-  and  19.6-fold, 
respectively.  This  drop-off  in  performance  is  due  to  the  significant  role  that  set-up  overhead 
plays  in  the  total  batch  run  time.  For  this  simple  model,  the  actual  computation  of  the  state 
variable  trajectories  for  each  simulation  run  is  very  small  compared  to  the  time  involved  in 
compiling  and  distributing  the  model  to  each  processor.  Thus,  the  performance  using  more  than 
10  processors  results  in  diminishing  returns  when  the  computational  demand  of  the  simulation 
session  is  small. 

To  further  explore  this  effect,  we  repeated  the  test  with  a  TOx'  exemplar  model,  where 
initial  values  of  all  state  variables  were  increased  by  a  factor  of  10.  This  is  equivalent  having  10 
plasmids  containing  geneA  present  in  the  same  reaction  volume  with  ten  times  the  number  of 
substrate  molecules  available.  The  speed-up  results  using  the  lOx  model  are  also  given  in  Figure 
7.  For  this  computationally  more  complex  problem,  the  value  of  additional  processors  is  clearly 
apparent  even  when  50  processors  are  accessed.  Thus,  the  value  of  multi-processor  hardware  is 
clearly  dependent  on  the  computational  dimensions  of  the  problem. 
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Figure  7 :  Scaling  of  simulation  run  time  with  the  number  of  processors.  Each  model  was  run 
10000  times  as  a  batch  job  using  the  BNS  software  on  an  HP  XC  machine  with  distributed 
memory  architecture  and  the  Gillespie  direct  stochastic  simulation  algorithm  and  various 
numbers  of  processors.  Speed-up  was  calculated  as  the  run  time  for  the  batch  job  on  one 
processor  divided  by  the  run  time  with  the  given  number  of  processors. 


Scaling  of  BNS  on  HP  XC  at  AFRL  MSRC 


Improvement  in  estimating  the  mean  and  standard  deviation  of  state  variables  and  reaction  rates 
with  the  number  of  simulation  runs 

The  mean  and  standard  deviation  of  the  number  of  molecules  averaged  over  the  ensemble 
of  simulation  runs  at  time  t  is  an  estimate  of  the  first  moment  and  variance  of  the  random 
variable  s  as  defined  by  the  solution  of  the  CME,  P(s,t  I sq,  to).  As  the  number  of  simulations 
increases,  these  estimates  improve.  This  can  be  seen  by  inspecting  the  estimated  mean  ±  SD 
between  batch  jobs  with  increasing  numbers  of  simulation  runs  (Figure  8).  The  estimated 
ensemble  mean  ±  SD  for  the  number  of  molecules  of  the  polymerase-promoter  complex 
(T7_RNAp_T7_P)  is  shown  in  Figure  8(A).  For  this  state  variable,  the  possible  states  in  state 
space  are  either  0  or  1,  thus,  the  number  of  molecules  of  the  complex  fluctuate  over  time  from  0 
—>l  or  1  — »  0  in  any  given  simulation  (Figure  8(A),  top  left  plot).  At  any  given  time,  the  mean 
over  simulation  runs  fluctuates  significantly  from  one  sample  time  to  the  next  when  averaged 
over  a  small  number  of  simulation  runs  -  i.e.,  the  mean  appears  to  be  noisy  when  the  number  of 
simulations  are  small  (lower  panel  of  Figure  8(A)).  However,  this  is  merely  a  consequence  of 
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the  approximate  statistical  estimate  of  the  first  moment  of  the  solution  of  the  CME  using  a  small 
number  of  simulations  and  the  standard  error  of  the  mean  will  decrease  with  increasing  n  as 

SD/ yfn  (where  SD  is  the  standard  deviation  of  the  ensemble  distribution).  In  fact,  the  exact 

mean,  (s(t)) ,  is  a  smooth  function  of  time  as  the  series  of  approximations  with  increasing  n  in 

the  lower  panel  of  Figure  8(A)  suggests.  Only  for  estimations  of  the  mean  with  n  >  100  runs 
does  the  shift  in  the  mean  at  approximately  2300  sec  become  well  defined.  This  shift  is  due  to 
the  cessation  of  mRNA  synthesis.  Another  point  to  note  from  the  top  panel  of  Figure  8(A)  is  that 
the  estimates  of  the  SD  of  the  ensemble,  (cr(*))  ,  also  fluctuate  significantly  from  one  time  point 

to  the  next  when  n  is  small,  but  tends  to  smooth  out  with  increasing  n  as  the  estimates  of  the  SD 
improve. 

Figure  8(B)  shows  the  behavior  of  geneA_mRNA  as  n  increases.  Here,  the  estimates  of 
the  ensemble  mean  and  SD  again  shows  significant  fluctuations  from  one  time  point  to  the  next 
when  n  is  small  due  to  the  inaccuracies  in  each  estimate  of  (s(t))n  and  (cr(/))  .  As  n  increases, 
each  individual  estimate  of  the  mean  of  s(t)  improves  and  the  plot  approaches  the  exact  smooth 
curve  for  (s(t))  •  Also,  the  estimates  of  the  SD  also  improve  with  increasing  n  and  the  SDs  from 

one  time  to  the  next  smooth  out.  The  dependency  of  the  accuracy  of  the  estimates  of  the  mean 
and  SD  on  the  number  of  simulations  is  an  issue  that  must  be  taken  into  consideration  when 
dealing  with  stochastic  simulations;  model  predictions  of  experimental  observations  will  only  be 
exact  in  the  limit  of  n  — >  oo  simulations.  Thus,  it  is  necessary  to  use  a  large  number  of 
simulations  to  investigate  the  behavior  of  the  of  the  system  if  one  wished  to  fit  model  predictions 
to  experimental  data.  The  larger  the  number  of  simulations  the  better  the  estimate  of  the  model 
prediction,  thus  reducing  an  additional  source  of  error  that  is  not  present  when  fitting  solutions  of 
C-D  ODEs  to  experimental  data. 


Figure  8:  Comparison  of  estimates  of  the  mean  and  standard  deviation  of  selected  state  variables 
with  increasing  numbers  of  simulation  runs.  For  each  state  variable,  the  top  panel  is  the  mean  ± 
SD  for  various  numbers  of  simulations  plotted  at  10  sec  intervals  and  the  bottom  panel  is  only 
the  mean.  In  each  lower  panel,  the  solution  of  the  C-D  ODE  solution  is  also  given.  (A)  the 
T7_RNAp-T7_P  complex  and  (B)  geneA_mRNA. 
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(A)  T7_RNAp-T7_P 
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n  =  10  runs 


n  =  100  runs 


n  =  1000 


C-D 


1 - vmwj.  1, 

og 

- 

22 


(B)  geneA  mRNA 


n  =  1  runs 


n  =  10  runs 


n  =  100  runs 
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Similar  issues  arise  when  investigating  reaction  event  rates.  Estimates  of  the  event  rate  using  a 
time-averaging  interval  of  10  sec  for  reactions  rl  and  r8  are  given  in  Figure  9.  For  reaction  rl 
(the  association  of  the  polymerase,  T7_RNAp,  with  the  promoter  for  geneA,  T7_P,  to  form  the 
T7_RNAp-T7_P  complex),  a  single  simulation,  n  =  I ,  indicates  that  the  reaction  occurred 
anywhere  from  0  to  4  times  in  any  10  sec  counting  intervals  (corresponding  to  event  rates  of  0  - 
0.4  events/sec)  with  large  fluctuations  from  one  time  point  to  the  next.  If  multiple  simulations 
are  run,  the  estimated  event  rate  can  be  averaged  over  the  ensemble  of  simulations.  As  can  be 
seen  from  Figure  9(A),  averaging  over  multiple  runs  gives  a  more  consistent  estimate  of  the 
mean  and  SD  of  the  event  rate  as  a  function  to  time.  Even  for  a  reaction  that  occurs  at  a 
significantly  greater  rate  than  reaction  rl,  e.g.,  reaction  r8  (Figure  9(B)),  the  effect  of  averaging 
over  multiple  simulations  is  still  apparent. 

Comparison  between  exact  simulations  and  the  C-D  approximation 

Although  the  basic  biochemical  reactions  in  a  biomolecular  reaction  network  are 
stochastic  in  nature,  the  fact  that  some  of  the  molecular  species  in  the  system  are  present  in 
relatively  large  numbers  should  allow  for  the  approximation  of  the  first  moment  of  the  state 
variables  by  the  continuous  deterministic  approach.  To  investigate  this  possibility,  the  exemplar 
model  was  simulated  using  the  C-D  ODE  approach  (see  Supplementary  Material  for  reaction 
parameters).  The  results  are  shown  in  Figure  8  for  the  selected  state  variables  and  in  Figure  9  for 
the  selected  reaction  event  rates.  As  is  evident,  with  one  exception,  the  C-D  approximation  gives 
a  reasonable  representation  of  the  ensemble  average  for  the  state  variables  for  this  particular 
model.  The  one  noticeable  difference  between  the  two  approaches  can  be  seen  in  the  regions 
where  there  is  a  transition  in  the  dynamics  due  to  the  termination  of  certain  reactions.  In  these 
regions,  the  ensemble  means  of  the  S-D  simulations  tend  to  have  smooth  transitions  whereas  the 
C-D  simulation  has  a  sharper  discontinuity.  This  effect  is  due  to  the  variability  in  the  timing  of 
the  transition  in  the  S-D  approach  as  discussed  above.  Each  individual  S-D  simulation  has  a 
rather  sharp  transition  when  these  reaction  terminate,  but  because  the  time  of  the  transition  varies 
from  simulation  to  simulation,  the  ensemble  mean  has  a  smooth  transition.  An  additional 
limitation  of  the  C-D  approximation  is  that  no  information  on  the  variability  in  the  number  of 
molecules  of  state  variables  in  individual  instances  can  be  obtained  from  this  approach. 

Figure  9:  Comparison  of  estimates  of  the  time-averaged  reaction  event  rates  with  increasing 
numbers  of  simulation  runs.  The  time-averaged  event  rates  (time-averaging  interval  =10  sec) 
averaged  over  n  simulation  runs  are  plotted  for:  (A)  reaction  rl  -  association  of  T7_RNAp  and 
T7_P  to  form  the  T7_RNAp-T7_P  complex,  and  (B)  reaction  r8  -  the  catalytic  ligation  reaction. 
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(A)  Reaction  rl 


n  =  1 


n  =  10 


n  =  100 
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(B)  Reaction  r8 


n  =  1000  C-D 
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Discussion 


The  Biomolecular  Network  Simulator  (BNS)  software  developed  in  this  laboratory 
allows  users  to  simulate  the  stochastic  behavior  of  complex  biological  reaction  networks  with  the 
flexibility  of  utilizing  either  single  or  multi-processor  hardware.  Some  of  the  more  important 
features  of  the  BNS  software  are: 

-  the  usage  of  MATLAB  and  C-coded  functions  allowing  the  user  to  combine  intensive 
visualization  of  data  with  high  speed  computations; 

the  implementation  of  parallelized  code  for  multiple  simultaneous  simulations  allowing 
the  user  to  run  BNS  on  multi-processor  machines; 

the  availability  of  the  option  to  run  the  code  in  the  interactive  or  batch  mode; 
a  user  friendly  graphical  user  interface  allowing  the  user  to  easily  select  and  modify 
models,  model  parameters,  simulation  parameters  and  analysis  tools;  and 
a  comprehensive  set  of  analysis  tools  providing  for  post-simulation  analysis  and  plotting 
of  results. 

These  features  allow  the  user  to  perform  a  wide  range  of  simulation  tasks  in  a  relatively 
transparent  environment. 

The  BNS  software  was  used  to  investigate  the  generic  behavior  of  gene  expression  in  a 
cell-free  transcription-translation  system  in  a  closed  system  consisting  of  an  artificial  vesicle. 

The  ability  to  simulate  this  system  allows  one  to  identify  the  critical  factors  that  limit  the 
behavior  of  the  system.  Although  the  model  for  the  system  is  relatively  crude,  it  is  clear  that  the 
availability  of  limiting  amino  acids  controls  the  ultimate  expression  of  proteins,  the  availability 
of  GTP  limits  transcription  of  the  plasmid  gene  to  form  mRNA  and  the  availability  of  substrates 
for  the  catalytic  ligation  reaction  limits  the  generation  of  the  product.  This  quantitative 
knowledge  can  be  used  to  attempt  to  optimize  the  system  to  maximize  production  of  products, 
either  proteins  or  metabolites.  As  the  models  for  the  CFTT-vesicle  system  become  more 
sophisticated,  a  more  detailed  understanding  of  its  behavior  will  evolve. 
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Appendix  A  -  Stochastic  Simulation  Algorithm 

The  mathematical  description  of  the  behavior  of  stochastic  biomolecular  reaction  systems 
is  based  on  Markov  process  theory.  The  system  behaves  as  a  multi-variant,  discrete  state, 
Markov  jump  process  and  is  governed  by  the  chemical  master  equation  (CME).  The  solution  of 
the  CME  is  in  fact  the  mathematically  exact  description  of  the  behavior  of  the  system.  For  our 
purposes,  we  will  consider  a  biomolecular  reaction  network  consisting  of  Ns  identifiable 
molecular  species,  denoted  5,  (i  =  1,  2, ...,  Ns).  These  molecular  species  can  undergo  NR 
fundamental  chemical  reactions  Rk(k  =  1,2, ...,  Nr)  and  are  confined  to  a  fixed  reaction  volume, 
VR.  It  is  assumed  that  the  system  is  well-mixed  (homogenous)  and  at  constant  volume  and 
temperature.  Let  s(t)  be  an  Ns  dimensional  state  vector  whose  elements  st(t)  (i  =  1,  2, ...,  Ns)  are 
the  number  of  molecules  of  each  molecular  species  S,  at  time  t. 

The  probability  that  the  kth  fundamental  reaction  Rk  will  occur  in  the  next  time  interval  dt, 
prob  Rk  (; t ) ,  is  physically  defined  by  molecular  collision  theory  in  solution  and  the  quantum 

mechanics  of  molecular  interactions.  It  has  been  shown  (Gillespie,  1977)  that  a  reasonable 
mathematical  formulation  of  this  probability  is  defined  by: 

prob  Rk(t)  =  ak(s,t)* dt  +  o(dt)  (Al) 

where  ak(s,t)  is  the  propensity  of  the  kth  fundamental  reaction  (see  Table  Al).  Furthermore,  for 
biomolecular  reaction  systems  at  constant  temperature  and  volume,  the  system  is  temporally 
homogenous  which  means  that  ak(s,t)  is  not  explicitly  dependent  on  time. 

ak(s,t)  =  ak(s) 

(A2) 

However,  ak(s)  is  implicitly  dependent  on  time  through  its  dependency  on  s(t).  he  relationship  in 
Equation  (Al)  is  exactly  correct  to  the  order  of  dt.1 

The  Nr  by  Ns  dimensional  stoichiometry  matrix  v  provides  the  necessary  information  as 
to  the  change  in  the  number  of  each  molecular  species  when  a  particular  reaction  occurs.  The  kth 
row  of  the  stoichiometry  matrix,  vk,  is  a  Ns  dimensional  vector,  referred  to  as  the  state  change 
vector,  that  specifies  the  changes  in  the  state  variables  when  the  kth  reaction  occurs. 

The  stochastic  process  that  describes  the  behavior  of  the  biomolecular  reaction  network  is 
characterized  by  the  state  density  function  P(s,  t) .  This  function  gives  the  probability  that  the 
system  is  in  state  s  at  time  t,  where  s  can  take  on  any  value  in  the  allowable  state  space.  P(s;t) 
is  the  solution  of  the  CME: 


i 


Order  of  dt,  indicated  by  o(dt),  means  lim  dt^0 


o(dt) 

dt 


=  0. 
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(3) 


— =  X  ^  (s  —  vk )  *  P(s  —  vk  ,t)  —  ak  (s)P(s,  t ) 
dt  k= i 

Note,  the  sum  is  over  all  of  the  possible  reactions  that  can  occur.  The  first  term  when  summed 
over  all  possible  reactions  give  the  increase  in  probability  due  to  transitions  into  state  s  from 
states  one  reaction  away,  and  the  second  term  give  the  decrease  in  probability  due  to  reactions 
that  originate  in  state  s  and  result  in  transitions  to  new  states  one  reaction  away. 

The  initial  condition  for  the  biomolecular  reaction  network  of  interest,  P0  (s) ,  depends  on 

the  precision  and  accuracy  of  the  measurement  techniques  used  to  experimentally  characterize 
the  system.  In  theory,  the  system  is  in  a  single  well  defined  state  so  at  time  to,  where  the  number 
of  molecules  of  each  molecular  species  is  equal  to  the  exact  number  of  molecules  of  that  species 
contained  in  the  reaction  volume  VR  at  time  to.  Thus,  P0  (s)  can  be  defined  by  the  Kronecker 
delta  function  as 

=  ^(^o) 

(A4) 

For  our  purposes,  it  will  be  assumed  that  the  initial  condition  as  defined  by  Equation  (A4)  will 
hold  and  the  state  density  function  can  be  written  as  the  conditional  probability  P(>,t|,s0,t0) . 

An  analytical  solution  of  the  CME  is  not  possible  usually  and  direct  numerical 
computation  of  the  solution  is  computationally  overwhelming  due  to  the  large  state  space  (the 

Ns 

dimension  of  the  state  space  is  d  =  (sf“x  + 1)  where  sfax  is  the  maximum  number  of 

(=1 

molecules  of  s,  that  can  theoretically  exist  in  the  system).  However,  the  direct  simulation  of 
exact  (theoretically  possible)  trajectories  in  state  space  is  feasible.  For  a  multi-variant,  discrete 
state,  Markov  jump  process,  it  is  possible  to  define  the  next-jump  density  function  p(r,k\s,t) 

such  that:  p{r,k\s,t)  *  dr  is  the  probability  that  the  next  jump  will  occur  between  t  +  rand  t  +  r 

+  dr  and  will  involve  reaction  R/;,  i.e.,  the  event  will  change  the  state  of  the  system  to  s  +  v 
given  that  the  system  was  in  state  s  at  time  t.  It  can  be  shown  for  a  temporally  homogenous, 
multi -variant,  discrete  state,  Markov  jump  process  that 

p(r,  k\ s,  t )  =  pl  (r|  5,  t )  *  p2  (k|  5,  t )  (A5a) 


where 


Nr 

Pi  (r\s,t)  =  2^p(T,k\  s,  t)  =  a0  (5)  *  e~a°{s),r 

k= 1 


(A5b) 
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p2  (k  |  s,  t)  =  |  p(z,  k  |  s,  t)  dr  = 


and 


(A5d) 


p^sj)  is  the  conditional  density  function  for  the  waiting  time  to  the  next  reaction  with  the 
mean  waiting  time  of  l/ao(s ).  p2 ( k\s ,  t)  is  the  conditional  density  function  that  the  next  reaction 

will  be  the  kth  reaction  and  is  equal  to  the  ratio  of  the  propensity  for  the  kth  reaction  to  the  total 
propensity  for  all  reactions  in  the  system. 

The  evolution  of  the  state  vector  s(t)  can  be  calculated  using  various  algorithms  proposed 
for  Monte  Carlo  generation  of  stochastic  trajectories.  The  Gillespie  direct  simulation  algorithm, 
(Gillespie,  1077;  Table  Al),  which  is  one  of  the  algorithms  implemented  in  the  Biomolecular 
Network  Simulator  software,  answers  two  questions:  (1)  which  reaction  will  occur  next,  and  (2) 
what  is  the  waiting  time  for  the  next  reaction  to  occur.  To  answer  these  questions,  two  random 
numbers  uniformly  distributed  over  the  interval  (0,1)  -  rj  and  /*2-  are  generated.  The  first 
random  number  is  used  to  determine  the  next  reaction  Rk  according  to  the  probability  density  in 
Equation  (A5c).  Thus,  k  is  selected  to  satisfies  the  condition 


k- 1  k 

Yjai<r\*ao<Yj  ai 


i=\ 


i=\ 


(A6) 


where  ao  is  given  by  Equation  (A5d). 

The  distribution  of  the  waiting  time  is  given  by  the  probability  density  function  Equation 
(A5b).  The  waiting  time  for  the  next  reaction  is  calculated  as  (Gillespie,  1977) 

1  .  1 

r  =  — log  — 

«0  r2 

(A7) 

Once  the  next  reaction  and  its  waiting  time  are  determined,  the  reaction  is  executed  and 
the  state  of  the  system  is  changed  according  to  the  state-change  vector  vk.  The  simulation  time  is 
increased  by  x  and  any  reaction  propensities  affected  by  the  reaction  are  recalculated.  If  the 
simulation  time  is  less  than  the  total  duration  of  the  simulation  the  algorithm  is  repeated  to 
generate  the  next  simulation  step,  otherwise,  the  simulation  is  terminated.  The  Gillespie 
stochastic  algorithm  is  exact  for  the  fundamental  reactions  indicated  in  Table  A2.  independent  of 
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the  number  of  reacting  molecules  in  the  system. 

A  second  algorithm  based  on  the  modifications  of  Gibson  and  Brack  (2000)  is  also 
included  in  the  BNS  software.  This  algorithm,  which  is  also  mathematically  exact,  allows  for  the 
reuse  of  random  numbers  to  minimize  the  computational  time  required  to  generate  new  random 
numbers  for  each  reaction  step. 

The  third  algorithm  available  in  the  BNS  software  is  the  x-leaping  algorithm  of  Gillespie 
(Gillespie,  2001).  The  tau-leaping  algorithm  calculates  a  time  interval  x  which  encompasses 
more  than  one  reaction  event  and  satisfies  the  Leap  Condition,  i.e.,  the  expected  state  change 
induced  by  the  leap  must  be  sufficiently  small  that  no  propensity  function  changes  its  value  by  a 
significant  amount.  Several  methods  have  been  proposed  recently  to  choose  the  size  of  the  time 
interval  for  tau-leaping  (Chatterjee,  et  al.,  2005;  Gillespie,  D.,  2001,  Gillespie  and  Petzold,  2003; 
and  Tian  and  Burrage,  2004).  We  implemented  the  tau-leaping  method  proposed  by  Cao  et  al. 
(2006).  In  that  method  a  tau  selection  formula  is  given  by 


t  =  min 

ielrv 


max{  sxt  /  gt  ,1}  max{  £xi  /  gi ,1}2 


A,'(*)| 


crf(x) 


(A8) 


where  gi  is  the  highest  order  of  reaction  in  which  species  S)  appears  as  reactant,  s  is  an  error 
control  parameter,  lrs  is  the  set  of  indices  of  all  reactant  species,  and 

j 

(9) 

and 

(10) 


After  the  time  interval  x'  has  been  selected,  the  number  of  firings  of  each  reaction 
channel  during  this  time  interval  is  approximated  as  a  Poisson  random  variable.  The  Poisson 
random  variable  can  have  arbitrary  large  sample  value  and  it  is  possible  that  the  population  of 
some  of  the  molecular  species  can  ran  negative.  Therefore,  a  critical  number  of  molecules  nc 
(typically  in  the  range  of  5-20)  was  introduced.  If  the  number  of  molecular  species  gets  less  then 
nc,  all  reaction  in  which  that  species  appears  as  reactant  are  defined  as  critical.  These  reactions 
are  simulated  by  the  stochastic  simulation  algorithm.  We  calculate  the  sum  of  propensity 
functions  of  all  the  critical  reactions  ac0  and  generate  a  second  candidate  time  x"  according  to 

1  ,  1 

T  ft— log-. 
a0  r 

(11) 
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p2  (k  |  s,  t)  =  |  p(z,  k  |  s,  t)  dr  = 
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p^sj)  is  the  conditional  density  function  for  the  waiting  time  to  the  next  reaction  with  the 
mean  waiting  time  of  l/ao(s ).  p2 ( k\s ,  t)  is  the  conditional  density  function  that  the  next  reaction 

will  be  the  kth  reaction  and  is  equal  to  the  ratio  of  the  propensity  for  the  kth  reaction  to  the  total 
propensity  for  all  reactions  in  the  system. 

The  evolution  of  the  state  vector  s(t)  can  be  calculated  using  various  algorithms  proposed 
for  Monte  Carlo  generation  of  stochastic  trajectories.  The  Gillespie  direct  simulation  algorithm, 
(Gillespie,  1077;  Table  A2),  which  is  one  of  the  algorithms  implemented  in  the  Biomolecular 
Network  Simulator  software,  answers  two  questions:  (1)  which  reaction  will  occur  next,  and  (2) 
what  is  the  waiting  time  for  the  next  reaction  to  occur.  To  answer  these  questions,  two  random 
numbers  uniformly  distributed  over  the  interval  (0,1)  -  rj  and  /*2-  are  generated.  The  first 
random  number  is  used  to  determine  the  next  reaction  Rk  according  to  the  probability  density  in 
Equation  (A5c).  Thus,  k  is  selected  to  satisfies  the  condition 
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where  ao  is  given  by  Equation  (A5d). 

The  distribution  of  the  waiting  time  is  given  by  the  probability  density  function  Equation 
(A5b).  The  waiting  time  for  the  next  reaction  is  calculated  as  (Gillespie,  1977) 
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Once  the  next  reaction  and  its  waiting  time  are  determined,  the  reaction  is  executed  and 
the  state  of  the  system  is  changed  according  to  the  state-change  vector  vk.  The  simulation  time  is 
increased  by  x  and  any  reaction  propensities  affected  by  the  reaction  are  recalculated.  If  the 
simulation  time  is  less  than  the  total  duration  of  the  simulation  the  algorithm  is  repeated  to 
generate  the  next  simulation  step,  otherwise,  the  simulation  is  terminated.  The  Gillespie 
stochastic  algorithm  is  exact  for  the  fundamental  reactions  indicated  in  Table  Al.  independent  of 
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Table  A2:  Summary  of  the  main  steps  in  the  Gillespie  stochastic  simulation  algorithm. 


Step  1:  Initialization.  Set  the  state  vector  s  =  s0  (the  initial  number  of  molecules  of  each  molecular 
species)  and  set  the  time  t  =  0. 

Step  2:  Calculate  the  propensity  a (s )for  each  reaction  R,  and  their  sum  a0(s ) 

Step  3:  Generate  two  random  numbers  r,  and  r2  uniformly  distributed  over  the  interval  (0,1). 

Step  4:  Find  the  next  reaction  Rj  according  to  Eq.  (1). 

Step  5:  Calculate  the  waiting  time  t,  for  the  next  reaction  according  to  Eq.  (4). 

Step  6:  Update  the  number  of  molecules  of  all  molecular  species  affected  by  reaction  Rj  according  to  the 
state-change  vector  v, . 

Step  7:  Update  the  simulation  time  t  -  t  +  i,. 

Step  8:  Calculate  a  new  propensity  at(s)  for  each  reaction  that  was  affected  by  Step  6  and  their  sum  ak(s) 
Step  7:  Return  to  Step  3  or  termination. 

Termination.  Simulations  are  terminated  when  the  simulation  time  exceeds  the  maximal  time  of  the 
simulations. 
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Appendix  B  -  Biomolecular  Network  Simulator  Software 


The  Biomolecular  Network  Simulator  software  is  designed  to  provide  an  environment  to 
perform  stochastic  simulations  and  to  analyze  the  resulting  data.  BNS  can  be  run  on  any 
computer  platform  where  MATLAB  6.5  or  newer  is  installed.  Complete  details  of  the  BNS 
software  can  be  found  at  http://www.bioanalvsis.orq. 


Input  Data 

A  model  is  a  set  of  mathematical  relationships  that  describe  the  behavior  of  biochemical 
reactions  that  control  cellular  biological  processes.  Each  of  the  ‘Model’  directories  contains  one 
or  more  subdirectories  with  model  description  files  and/or  different  set  of  parameters  for  the 
same  model  and  an  ‘Output’  directory  where  the  results  of  simulations  are  stored.  There  are  two 
types  of  model  directories:  one  for  models  defined  in  the  Systems  Biology  Markup  Language 
(SBML)  format  (Hucka,  et  al.,  2003)  and  one  for  models  defined  in  BNS  format  as  a  set  of 
MATLAB  m-files.  There  are  two  additional  m-files  in  each  ‘Parameters’  directory  - 
general_constants.m  and  storing_and_plotting.m,  which  contain  information  about  simulation 
conditions  and  storing  of  the  output  data,  respectively.  In  addition,  BNS  allows  one  to  perform 
simulations  with  multiple  parameter  sets,  with  each  parameter  set  being  run  multiple  times. 
Simulations  with  multiple  parameter  sets  can  be  used  for  optimization  and  sensitivity  analysis  of 
the  model. 

Output  Data 

There  are  two  types  of  output  files:  snapshot  data  and  event  log  data.  Both  of  these  files 
are  in  MATLAB  format.  Snapshot  data  files  contain  the  state  of  the  system  (number  of 
molecules  of  each  molecular  species)  at  user  specified  time  intervals.  The  information  stored  in 
the  snapshot  files  are  used  to  create  runtime  interactive  graphics  and  for  post  hoc  analysis  of  the 
data.  The  second  type  of  output  files  -  the  event  log  files  -  contain  the  record  of  every  discrete 
event  that  occurs  during  the  simulation.  The  user  should  be  aware  that  event  log  files  may 
require  considerable  memory  or  hard  disk  space  and,  therefore,  may  create  memory  management 
problems  for  simulations  involving  a  large  number  of  long  runs  or  for  large  reaction  networks. 

Running  Simulations 

The  BNS  can  be  run  either  in  command  line  mode  or  via  a  GUI.  The  GUI  allows  the  user 
to  modify  model  parameters  at  runtime  and  to  execute  simulations  in  the  interactive  or  batch 
mode  on  HPC  resources.  The  main  dialog  window  of  the  BNS  GUI  is  shown  in  Figure  Bl.  It 
allows  the  user  to  select  the  appropriate  ‘Model’  and  ‘Parameters’  directories  and  set  the  ‘Run’ 
mode.  A  click  on  the  ‘Details’  button  next  to  the  ‘Parameters’  directory  opens  a  new  window, 
shown  in  Figure  B2.  This  dialog  window  allows  the  user  to  modify  model  parameters  and  to  set 
parameters  for  the  simulation.  If  simulations  are  run  in  the  interactive  mode,  the  results  of 
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simulations  will  instantly  appear  on  the  screen.  Usually,  HPC  centers  allocate  limited  resources 
(in  number  of  processors  and  running  time)  for  the  interactive  simulations,  therefore  BNS  can  be 
run  in  ‘Batch’  mode.  In  this  mode  all  output  data  are  stored  on  the  hard  drive  for  further 
analysis. 

Analysis 

The  BNS  has  a  comprehensive  set  of  tools  for  the  post-simulation  analysis.  A  GUI  for 
the  analysis  tools  allows  the  user  to  easily  select  the  data  and  to  set  conditions  for  the  analysis. 
Multiple  types  of  post-simulation  analyses  are  available. 

Plots  of  number  of  molecules  vs.  time 

The  most  frequently  used  type  of  analysis  is  a  plot  of  the  number  of  molecules  vs.  time. 
Such  plots  are  available  in  the  interactive  mode  or  as  post-simulation  analysis.  There  are  two 
ways  to  create  plots:  each  compound  is  plotted  on  a  separate  figure  or  multiple  compounds  are 
plotted  on  the  same  figure  window  (grouping  mode).  The  plot  in  Figure  B3  is  in  grouping  mode 
and  shows  the  behavior  of  two  molecular  species,  SI  and  S2,  over  the  time  interval  of  1500 
seconds  for  a  biomolecular  reaction  network  containing  transcription,  translation  and  metabolic 
reactions.  The  number  of  molecules  vs.  time  plots  can  be  created  with  both  types  of  output  files: 
snapshot  data  or  event  log  data. 

Time-weighted  average  analysis 

A  time-weighted  average  analysis  refers  to  the  calculations  of  the  average  number  of 
molecules  of  a  particular  molecular  species  during  a  user  selected  time-bin.  The  average  is 
weighted  according  to  the  amount  of  time  the  compound  exists  in  each  state  during  the  selected 
time-bin.  The  time  weighted  average  is  then  plotted  versus  time.  The  averaging  analysis  can  be 
performed  for  a  single  run  or  for  a  selected  set  of  runs. 

When  the  analysis  is  applied  to  multiple  runs,  the  plot  shows  the  between  run  average 
(the  average  of  each  run’s  time-weighted  average)  and  standard  deviation.  As  in  the  previous 
case,  the  user  can  plot  each  compound  on  a  separate  figure  or  multiple  compounds  on  the  same 
figure.  Figure  B4  shows  the  between  run  average  of  the  time  weighted  average  number  of 
molecules  for  the  same  50  runs  as  shown  on  Figure  B3  using  a  time  averaging  interval  of  10  sec. 

Reaction  frequency  analysis 

Complex  biomolecular  reaction  networks  usually  contain  reactions  that  occur  on  different 
time  scales:  some  reactions  have  a  low  propensity  and  occur  rarely;  other  reactions  are  very  fast 
and  occur  frequently.  The  data  stored  in  the  event  log  files  allow  the  user  to  perform  various 
reaction  frequency  analyses  of  the  simulation  data  to  learn  more  about  the  basic  nature  of  the 
system.  One  type  of  analysis  creates  plots  of  the  total  number  of  times  each  reaction  occurred 
during  the  simulation.  Figure  B5  shows  an  example  of  a  histogram  of  the  average  number  and 
standard  deviation  of  the  number  of  reaction  occurrences  in  each  reaction  channel  averaged  over 
the  50  runs  in  Figure  B3.  The  number  of  reactions  is  shown  in  the  logarithmic  scale. 
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Reaction  rates  (number  of  reactions  per  unit  time)  in  each  reaction  channel  can  be 
calculated  for  user-selected  time -bins  and  plotted  versus  time.  Such  types  of  analyses  provide 
important  information  about  the  behavior  of  the  system. 
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Figure  Bl:  A  screen  shot  of  the  main  BNS  GUI  dialog  window. 
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Figure  B2:  The  parameters  dialog  window  of  the  BNS  GUI  allows  the  user  to  modify  the  model 
parameters  and  to  set  simulation  parameters. 
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Figure  B3:  The  evolution  of  the  number  molecules  of  molecular  species  SI  and  S2  with  time. 
The  snapshot  data  for  50  runs  are  shown. 


Figure  B4:  The  averaged  number  of  compounds  SI  and  S2  in  the  time  interval  (0,  1500)  for  the 
same  simulation  runs  as  in  Figure  4.  For  each  simulation,  the  time  weighted  average  was 
calculated  using  a  10-sec  time-bin  and  the  time  weighted  averages  were  averaged  over  the  50 
simulation  runs.  Data  for  the  mean  ±  SD  are  shown. 
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Figure  B5:  The  average  total  number  of  reaction  occurrences  in  each  reaction  channel.  The  data 
are  for  the  mean  ±  SD  for  the  50  simulation  runs. 
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Appendix  C  -  geneACFTTOpO  Model  Documentation 

Conceptual  Model 
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Figure  C.  1 :  The  schematic  diagram  of  the  geneA  self-assembling  catalytic  reaction  model. 
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The  diagram  in  Figure  C.l  is  a  schematic  of  the  various  reactions  and  material  flow 
connections  between  state  variables  of  the  geneA_CFTT_OpO  model.  This  is  the  simplest  model 
for  the  biomolecular  reaction  system  that  retains  the  basic  features  of  the  system.  Because  many 
of  the  reactions  are  described  at  the  conceptual  level  as  lumped,  macro-reactions,  this  model 
represents  an  approximation  to  the  exact  fundamental  representation  of  the  biomolecular  reaction 
system  that  would  be  rigorously  compliant  with  the  basic  tenants  of  the  Markov  process  theory 
of  multi-variant,  discrete  state,  temporally  homogenous,  Markov  jump  processes. 

To  transform  the  conceptual  model  into  a  schematic  more  representative  of  the 
mathematical  description  of  the  system,  the  model  schematic  in  Figure  C.l  can  be  redrawn  as  in 
Figure  C.2. 


Components/State  Variables 

O  -  DNA/gene 

•  -  mRNA 
O  -  proteins 

O  -  metabolites 
O  ■  protein-DNA  complex 

•  -  protein-RNA  complex 


Fundamental  Reactions 


Reversible  non-catalytic  reaction 
Irreversible  non-catalytic  reaction 
Reversible  catalytic  reaction 
Irreversible  catalytic  reaction 
Reversible  physical  displacement 
Irreversible  physical  displacement 


Lumped  Reactions 

-  Reversible  lumped  reaction 

-  Irreversible  lumped  reaction 


Figure  C.2:  Schematic  diagram  of  the  mathematical  model  of  the  geneA_CFTT_OpO  model. 
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In  this  representation,  each  reaction  is  defined  and  all  of  the  substrates  for  each  reaction  are 
easily  identified.  In  this  case  the  mathematical  model  of  the  geneA-CFTT  system  is  a  mixture  of 
fundamental  and  lumped  reactions. 

Reactions  for  a  Minimal  Model  ofgeneA  Self-assembly  in  a  Cell-Free  Transcription-Translation 
System 

Transcription  (Reactions  rl  -  r3)) 

In  the  simplest  model  (denoted  by  the  postfix  OpO),  the  transcription  process  in  the  cell- 
free  transcription-translation  (CFTT)  expression  system  is  treated  as  a  two  step  reaction:  (1) 
reversible  association  and  dissociation  of  the  T7  RNA  polymerase  (T7_RNAp)  to  the  T7 
promoter  site  (T7_P)  on  the  plasmid  to  form  the  initiation  complex  (T7_RNAp-T7_P),  and  (2) 
creation  of  the  geneA  mRNA  (geneA_mRNA)  in  an  instantaneous  event  at  a  time  determined  by 
the  stochastic  probability.  The  formation  of  the  mRNA  utilizes  the  appropriate  number  of 
nucleotide  triphosphates  (NTPs:  ATP,  GTP,  CTP  and  UTP)  and  creates  one  pyrophosphate 
molecule  (PPi)  as  byproduct  per  NTP  incorporated.  The  number  of  each  NTP  used  is  determined 
by  the  sequence  of  the  geneA  gene  and  the  number  of  PPi  formed  will  be  equal  to  the  total 
number  of  nucleotides  in  the  translated  portion  of  the  geneA  gene. 

The  specific  reactions  involved  in  transcription  are: 

Reaction  rl :  The  initial  step  in  transcription  of  the  geneA  gene  is  the  association  of  T7  RNA 
polymerase  with  its  promoter  site  on  the  plasmid  containing  the  geneA  gene. 

T7_P  +  T7_RNAp  ->  T7_RNAp-T7_P 

Reaction  r2:  Dissociation  of  the  polymerase-promoter  complex. 

T7_RNAp-T7_P  ->  T7_P  +  T7_RNAp 

Reactions  rl  and  r2  are  fundamental  stochastic  reactions  that  are  described  by  standard 
Markov  propensities  for  bi-  and  uni-  substrate  reactions,  respectively: 

at  =  c,  *T7_RNAp  *T7_P/VR 


and 

a2=c2  *T7  _  RNAp  -T7  _P 


As  there  is  only  one  plasmid  in  the  reaction  volume,  the  state  variable  T7_P  that  represents  the 
T7  promoter  on  the  plasmid  can  fluctuate  only  between  values  of  1  and  0,  where  T7_P  =  1 
implies  the  geneA  T7  promoter  site  is  unoccupied  and  T7_P  =  0  when  T7_RNAp  is  associated 
with  the  promoter  site.  Thus, 
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_  k  *T1  _  RNAp  IV R  when  T1  _P  =  1 
1  [O  when  T1  _  P  =  0 

and 

\c2  when  T1  _  RNAp  -  T1  _  P  =  1 

“2  [O  when  T1  _  RNAp  -T7_P  =  0 

Reaction  r3 :  Elongation  and  ultimate  formation  of  the  geneA_mRNA  polymer  by  the 
incorporation  of  the  nucleotide  tri-phosphates  (UTP,  CTP,  ATP  and  GTP)  into  the  RNA 
polymer.  Inorganic  pyrophosphate  (PPi)  is  formed  as  a  byproduct. 

T7_RNAp-T7_P  +  381  UTP  +  429  CTP  +  377  ATP  +  369  GTP  -> 

geneA  mRNA  +  T7  P  +  T7_RNAp  +  1556  PPi 

where  the  nucleotide  composition  of  a  generic  geneA  has  been  used.  Reaction  r3  is  a  lumped, 
macro-reaction  and  must  be  treated  as  an  approximation  to  the  exact  series  of  fundamental 
reactions  that  constitute  the  transcription  reaction.  Again,  as  the  consequence  of  there  being  only 
one  plasmid  in  the  reaction  volume,  when  T7_RNAp  is  associated  with  the  promoter  site, 
T7_RNAp-T7_P  =  1,  and  the  transcription  reaction  (r3)  can  occur.  The  propensity  for  reaction 
r3  is 

a3  =  c3  * T1  _ RNAp  -Tl_P*f3 (ATP, GTP, CTP, UTP) / U* 
which  can  take  on  only  two  values 

_|0  when  T1  _  RNAp  -T1  _  P  =  0 

a 3  ~  [c3  *  f3 ( ATP , GTP, CTP, UTP) / when  T1  _  RNAp  -Tl_P  =  l 


where  f3(ATP,GTP,CTP,UTP)  describes  the  dependency  of  the  polymerization  reaction  on  the 

NTP  substrates.  Obviously,  a  detailed  model  of  the  transcription  reaction  would  be  more  exact 
and  the  natural  dependency  of  the  reaction  on  the  availability  of  substrates  would  appear  as  a 
consequence  of  the  fundamental  reactions  involved  in  transcription.  However,  for  the  purposes 
of  this  simple  model,  we  will  treat  r3  as  a  lumped  macro-reaction  and  use  an  approximate 
phenomenological  formulation  for  the  propensity  based  on  the  Michaelis-Menten  approximation 
for  isomerization  reactions.  Thus, 


f3  ( ATP  ,GTP,  CTP ,  UTP ) 


ATP  ) 

* 

GTP  ) 

* 

ATP  +  km3  _  ATP  J 

v  GTP  +  km3  _  GTP , 

CTP 

f  UTP  \ 

* 

{  CTP  +  km3  _  CTP  / 

UTP +  km3_UTP  J 
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Note,  0  <  fj(s4,s5,s6,s7)  <  1,  thus  this  formulation  guarantees  that  the  propensity  will  be  zero  if 
any  one  of  the  substrates  is  depleted,  and  will  be  limited  to  a  maximum  value  of  cj  even  when 
there  is  excess  of  the  NTP  substrates. 

The  probability  that  transcription  will  occur  is  non-zero  as  long  as  the  polymerase  binds 
to  the  promoter  and  there  are  sufficient  NTP  substrates  available  to  form  complete  mRNA 
polymers.  As  transcription  progresses,  the  geneAmRNA  formed  serves  as  either  the  substrate 
for  the  translation  process  or  for  the  mRNA  degradative  process. 

Translation  (Reactions  r5-r7) 

As  with  the  transcription  process,  in  this  simplest  conceptual  model  (geneA_CFTT  OpO), 
the  translation  process  is  treated  as  a  two  step  process:  (1)  reversible  association-dissociation  of 
the  ribosomal  small  unit  (Rib  s)  to  the  ribosomal  binding  site  (RBS)  on  geneA_mRNA,  and  (2) 
translation  of  the  mRNA  into  the  Pro_A  protein  by  an  instantaneous  event  at  a  time  determined 
by  the  stochastic  probability. 

The  three  translation  reactions  are: 

Reaction  r5:  Association  of  the  ribosomal  small-unit  (Rib  s  -  30S  subunit)  with  the  ribosomal 
binding  site  (RBS)  on  the  geneA  mRNA. 

geneA  mRNA  +  Rib  s  ->  Rib_s-geneA_mRNA 

Reaction  r6\  Dissociation  of  the  ribosomal  small  subunit  from  the  RBS  on  the  geneA-mRNA. 

Rib_s-geneA_mRNA  ->  geneA  mRNA  +  Rib  s 

As  with  transcription,  the  translation  reactions  r5  and  r6  are  fundamental  stochastic 
reactions  that  are  described  by  standard  Markov  propensities  for  bi-  and  uni-  substrate  reactions, 
respectively: 

a5  =  c5*  Rib  _s*  geneA  _  mRNA  /  VR 


and 

a6  =  c6  *  Rib  _s  —  geneA  _  mRNA 


Reaction  r7:  Elongation  and  ultimate  formation  of  the  Pro_A  protein  by  the  incorporation  of  the 
appropriate  number  of  amino  acids  into  the  protein  polymer.  Upon  completion  of  the  reaction, 
the  Pro_A  protein  is  released  and  the  geneA  mRNA,  ribosomal  large  subunit  (Rib  l)  and  the 
ribosomal  small  subunit  (Rib  s)  are  returned  to  the  available  pools  for  reuse.  Guanidine 
diphosphate,  adenine  diphosphate  and  inorganic  phosphate  (Pi)  are  formed  as  by-products. 


45 


Rib_s-geneA_mRNA  +  Rib_l  +  44A  +  9C  +  27D  +  43E  +  22F  +  40G  +  7H  +  31I  + 
23K  +  53L+13M+18N  +  25P  +  20Q  +  34R  +  29S  +  29T  +  21V  +  8W  +  21 
Y  +  1552  GTP  +  517  ATP  ->  geneA  mRNA  +  Rib  s  +  Rib  l  +  Pro_A  +  1552  GDP 
+  517  ADP  + 2068  Pi 


Note,  the  amino  acid  composition  of  Pro_A  is  based  on  the  nucleotide  sequence  of  geneA.  Once 
Rib  s  is  bound  to  the  geneA_mRNA  forming  the  Rib_s-geneA_mRNA  complex,  the  protein 
product,  Pro_A  ,  can  be  assembled  via  the  translation  reaction  (r7)  with  a  phenomenological 
propensity  of 

a7  =  c7  *  Rib  _  Z  *  /7  (Rib  _s-  geneA  _  mRNA ,  ATP,  GTP,  AA  _  I, ... ,  AA  _V)/VR 


where  the  substrate  dependency  is  given  by 
f  1  (Rib  _s- geneA  _mRNA,  ATP  ,GTP ,  AA  _  I ,... ,  AA  _V) 


Rib  _s  —  geneA  _  mRNA 

* 

ATP  y 

v  Rib  _  s  -  geneA  _  mRNA  +  kml  _  Rib  _  s  -  geneA  _  mRNA  j 

y  ATP  +  kml  _  ATP  J 

AA  V 


AA  V  +  kml  AA  V 


Here,  the  propensity  of  the  translation  reaction  is  modulated  by  the  availability  of  the  substrates, 
including  the  Rib_s-geneA_mRNA  complex,  energy  substrates  GTP  and  ATP,  and  all  of  the 
amino  acids  (AA_I  -  AA_V).  The  modulation  factor  fs,  0  <  f$  <1,  is  of  the  form  of  the  product 
of  hyperbolic  factors  and  accounts  for  certain  realistic  features  of  the  reaction,  namely,  a  zero 
propensity  when  any  of  the  substrates  is  not  available  to  complete  the  polymerization  reaction 
and  a  maximum  propensity,  C7*Rib_l/VR ,  when  all  substrates  are  available  in  saturating 
concentrations. 

Metabolic  Reaction 

The  enzyme  mediated  metabolic  reaction  catalyzed  by  Pro_A  results  in  the  ligation  of 
Sub_A  and  Sub_B  to  form  Prod_A  using  ATP  as  a  source  of  free  energy. 

Reaction  r8:  Catalytic  ligation  of  Sub_A  and  Sub_B  to  form  Prod_A  using  ATP  as  a  cofactor: 

Pro_A  +  Sub_A  +  Sub_B  +  ATP  ->  Pro_A  +  ProdA  +  ADP  +  Pi. 

The  synthetic  reaction  forming  Prod  A  is  mediated  by  the  product  of  the  geneA  gene. 

The  enzyme  Pro_A  utilizes  ATP  to  carry  out  the  ligation  of  Sub_A  and  Sub_B.  The  sequence  of 
micro-reactions  that  constitutes  the  ligation  reaction  -  association  and  dissociation  of  substrates 
ATP,  Sub_A  and  Sub_B  and  the  enzyme,  splitting  ATP  to  ADP  and  Pi  to  provide  free  energy  for 
the  ligation  reaction,  and  dissociation  of  Prod  A  and  by-products  from  the  enzyme  -  is  treated  as 
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a  single  lumped  macro-reaction  in  this  version  of  the  model.  The  reaction  propensity  is 
expressed  as 


as  =  cs  *  Pr  °  -  A  *  /8  {ATP ,  Sub  _  A,  Sub  _B)/VR 
where  the  f-factor 


fs(ATP,AA  _  C,  AA  _  E) 


f  ATP 

* 

Sub  _  A 

* 

'  Sub  _  B  ' 

v  ATP  +  kn&  _  ATP  y 

v  Sub  _A  +  kmS  _  Sub  _  A , 

v  Sub  _  B  +  km%  _  Sub  _  B  4 

again  modulates  the  reaction  propensity  to  assure  that  it  does  not  exceed  the  maximum 
propensity,  C7*Pro_AA^R,  when  all  substrates  are  saturating  and  that  the  propensity  goes  to  zero 
when  any  substrate  is  depleted.  The  product  of  the  reaction  is  Prod_A  along  with  the  by¬ 
products  ADP  and  inorganic  phosphate. 

Degradation  Reactions 

There  are  five  degradation  reactions  in  this  model:  one  for  degradation  of  mRNA  and 
four  for  proteolytic  degradation  of  the  proteins  present  in  the  biomolecular  reaction  network. 
Since  there  is  only  one  mRNA  present,  that  reaction  is  treated  as  an  independent  reaction. 
However,  since  there  are  four  proteins  present  in  the  system  -  Pro_A,  T7_mRNAp,  the  generic 
RNase,  the  generic  protease  Prot,  they  are  all  subject  to  degradation  by  the  single  generic 
protease  Prot.  Since  this  model  assumes  that  the  same  protease  degrades  all  four  proteins,  then  it 
is  necessary  to  use  a  slightly  modified  formulation  for  the  propensity  that  takes  into 
consideration  competition  between  the  substrates  for  degradation. 

Reaction  r4:  Degradation  of  geneA_mRNA  by  a  generic  RNase  resulting  in  the  formation  of  the 
constituent  nucleotide  mono-phosphates. 

geneAmRNA  +  RNase  ->  RNase  +  381  UMP  +  429  CMP  +  377  AMP  +  369  GMP 

The  mRNA  degradation  reaction  (r4)  introduces  the  possibility  that  geneA  mRNA  will  be 
degraded  by  the  generic  RNase.  The  products  of  the  degradation  process  are  the  nucleotide 
monophophates  (NMPs:  AMP,  GMP,  CMP,  UMP).  Here,  the  mRNA  degradation  reaction  is 
treated  as  a  lumped  macro-reaction  again  using  an  approximate  phenomenological  formulation 
for  the  propensity. 

a4  =  c4  *  RNase  *  f4(geneA  _ mRNA ) 


where 
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f 4  = 


geneA  _  mRNA 

geneA  _  mRNA  +  k4  _  geneA  _  mRNA 


the  hyperbolic  functional  form.  The  stochastic  reaction  parameter  cv  for  this  reaction  is  set  low 
in  the  current  model  so  that  there  is  little  degradation  of  mRNA. 

Reaction  r9:  Degradation  of  the  Pro_A  protein  by  a  generic  protease  (Prot)  resulting  in  the 
formation  of  the  constituent  amino  acids. 


Pro_A  +  Prot  ->  Prot  +  44A  +  9C  +  27D  +  43E  +  22F  +  40G  +  7H  +  31I  +  23K  +  53L 
+  13M+18N  +  25P  +  20Q  +  34R  +  29S  +  29T  +  21V  +  8W  +  21Y 

Since  this  reaction  is  treated  as  a  lumped,  macro-reaction  and  the  fact  that  the  protease  is 
assumed  to  degrade  all  proteins  present  in  the  biomolecular  reaction  network,  the 
phenomenological  propensity  is  given  by 

a9  =  cg  *  Pr  ot  *  f9  (Pr  o  _  A,  RNase ,  T1  _  RNAp ,  Pr  ot ) 

where 


n  = 


Pr  o  A 


Pr  o  A  +  km9  Pr  o  A  * 


1  + 


RNase 


T1  _  RNAp 


-  +  - 


Pr  ot 


km9  _  RNase  km9_Tl_RNAp  km9  ot 


This  expression  is  a  phenomenological  factor  that  modulates  the  propensity  for  reaction  r9  to 
compensate  for  utilization  of  the  enzyme  by  the  other  substrates. 

Reaction  riO:  Degradation  of  RNase  protein  by  the  generic  protease  resulting  in  the  formation 
of  the  constituent  amino  acids.  (The  amino  acid  composition  of  the  RNase  in  the  following 
reaction  is  an  arbitrary  placeholder) 

RNase  +  Prot  ->  Prot  +  50  A  +  50  C  +  50  D  +  50  E  +  50  F  +  50  G  +  50  H  +  50  I  +  50  K  +  50 
L  +  50  M  +  50  N  +  50  P  +  50  Q  +  50  R  +  50  S  +  50  T  +  50  V  +  50  W  +  50  Y 

Since  this  reaction  is  treated  as  a  lumped,  macro-reaction  and  the  fact  that  the  protease  is 
assumed  to  degrade  all  proteins  present  in  the  biomolecular  reaction  network,  the 
phenomenological  propensity  is  given  by 


a10  ~  cio  *  Pr  ot  *  fl0  ( RNase ,  Pr  o  _A,T7  _  RNAp  ,  Pr  ot) 


where 
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RNase 


/io  = 


RNase  +  kmlO  RNase  * 


1  + 


Pr  o  _  A 


-  +  - 


T1  _  RNAp 


Pr  ot 


km\0_Vvo_A  kmlO  _T1  _RNAp  kmlO  _Pr  ot 


Reaction  rl  1 :  Degradation  of  T7  polymerase  protein  by  the  generic  protease  resulting  in  the 
formation  of  the  constituent  amino  acids.  (The  amino  acid  composition  of  the  T7  polymerase  in 
the  following  reaction  is  an  arbitrary  placeholder) 

T7_RNAp  +  Prot  ->  Prot  +  50  A  +  50  C  +  50  D  +  50  E  +  50  F  +  50  G  +  50  H  +  50  I  +  50  K  + 
50  L  +  50  M  +  50  N  +  50  P  +  50  Q  +  50  R  +  50  S  +  50  T  +  50  V  +  50  W  +  50  Y 

Since  this  reaction  is  treated  as  a  lumped,  macro-reaction  and  the  fact  that  the  protease  is 
assumed  to  degrade  all  proteins  present  in  the  biomolecular  reaction  network,  the 
phenomenological  propensity  is  given  by 


a1 1  =cn  *  Pr  ot  *  /, ,  (T1  _  RNAp ,  Pr  o  _  A,  RNase ,  Pr  ot) 


where 


fn  = 


T1  _  RNAp 


T1  _  RNAp  +km\\_Tl  _RNAp 


1  + 


Pro  A 


■  +  ■ 


RNase 


■  +  ■ 


Pr  ot 


kml  1  Pr  o  A  kml  1  RNase  kml  1  Pr  ot 


Reaction  ri2:  Self-degradation  of  the  generic  protease  protein  resulting  in  the  formation  of  the 
constituent  amino  acids. (The  amino  acid  composition  of  the  generic  protease  Pro  in  the 
following  reaction  is  a  placeholder) 

Prot  +  Prot  ->  Prot  +  50  A  +  50  C  +  50  D  +  50  E  +  50  F  +  50  G  +  50  H  +  50  I  +  50  K  +  50  L 
+  50  M  +  50  N  +  50  P  +  50  Q  +  50  R  +  50  S  +  50  T  +  50  V  +  50  W  +  50  Y 

Since  this  reaction  is  also  treated  as  a  lumped,  macro-reaction  and  the  fact  that  the  protease  is 
assumed  to  degrade  all  proteins  present  in  the  biomolecular  reaction  network,  the 
phenomenological  propensity  is  given  by 


an  =  Ci2  *  Pr  ot  *  /12 (Pr ot,  Pr  o  _  A,  T1  _  RNAp  ,  RNase ) 


where 


fi: 


Pr  ot 


Pr  ot  +  kml2  Pr  ot  * 


1  + 


Pr  o  A 


-  + 


T1  _  RNAp 


-  +  - 


RNase 


km\  2_Pr  o  _  A  km\  2  _T1  _  RNAp  km\  2  _  RNase 
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Code  for  the  SBML  model  description  of  the  geneA_CFTTjOpO  model 


<?xml  version="1.0"  encoding="UTF-8"?> 

<sbml  xmlns="http://www.sbml.org/sbml/level1"  level="1"  version="1"> 

<model  name-'geneA"> 

<listOfCompartments> 

<compartment  name-"'  volume='T7> 

</listOfCompartments> 

<listOfSpecies> 

<specie  name="T7_RNAp"  compartment-'CI "  initialAmount-'30T7> 

<specie  name="T7_P"  compartment-'CI"  initialAmount="1'7> 

<specie  name=''T7_RNAp_T7_P"  compartment="C1"  initialAmount="0"/> 
<specie  name="geneA_mRNA"  compartment=''C1"  initialAmount=''0'7> 
<specie  name="Rib_s"  compartment-'CI"  initialAmount-'30T7> 

<specie  name-'RibJ"  compartment-'CI"  initialAmount="301'7> 

<specie  name="Rib_s_geneA_mRNA"  compartment="C1"  initialAmount="0"/> 
<specie  name="Pro_A"  compartment="C1"  initialAmount="0"/> 

<specie  name=''Prot"  compartment="C1"  initialAmount-'30T7> 

<specie  name-'RNase"  compartment-'CI"  initialAmount="301'7> 

<specie  name-'GC"  compartment-'CI ''  initialAmount-'0'7> 

<specie  name-' ATP"  compartment-'CI"  initialAmount="189329"/> 

<specie  name-'ADP"  compartment="C1"  initialAmount-'29197'7> 

<specie  name-'AMP"  compartment-'CI"  initialAmount="0"/> 

<specie  name=''GTP"  compartment="C1"  initialAmount=''234178'7> 

<specie  name-'GDP"  compartment-'CI"  initialAmount="0"/> 

<specie  name-'GMP"  compartment-'CI"  initialAmount="0"/> 

<specie  name-'CTP"  compartment-'CI"  initialAmount="150500"/> 

<specie  name-'CMP"  compartment="C1"  initialAmount="6622"/> 

<specie  name-'UTP"  compartment-'CI"  initialAmount="1 66754 "/> 

<specie  name-'UMP"  compartment-'CI"  initialAmount=”0'7> 

<specie  name-'Pi"  compartment-'CI"  initialAmount="0"/> 

<specie  name-'PPi"  compartment-'CI "  initialAmount="0"/> 

<specie  name=''AA_A"  compartment=''C1"  initialAmount="43645'7> 

<specie  name="AA_C"  compartment="C1"  initialAmount="94213'7> 

<specie  name="AA_D"  compartment="C1"  initialAmount="61705'7> 

<specie  name="AA_E"  compartment-'CI"  initialAmount-'79163"/> 

<specie  name="AA_F"  compartment="CT'  initialAmount="45752"/> 

<specie  name="AA_G"  compartment="C1"  initialAmount="76454'7> 

<specie  name="AA_H"  compartment="C1"  initialAmount="3341 1'7> 

<specie  name="AA_l"  compartment=''CT'  initialAmount="52976'7> 

<specie  name-'AA_K"  compartment-'CI"  initialAmount="21973'7> 

<specie  name="AA_L"  compartment="C1"  initialAmount="44247'7> 

<specie  name=''AA_M"  compartment=''C1"  initialAmount="37324"/> 

<specie  name="AA_N"  compartment="C1"  initialAmount="12943'7> 

<specie  name-'AA_P"  compartment="C1"  initialAmount=''205884'7> 

<specie  name="AA_Q"  compartment-'CT'  initialAmount="2107'7> 

<specie  name=''AA_R"  compartment="C1"  initialAmount="49665'7> 

<specie  name=''AA_S"  compartment=''C1"  initialAmount="35217"/> 

<specie  name="AA_T"  compartment="C1"  initialAmount="8428"/> 

<specie  name-'AA_V"  compartment=''C1"  initialAmount="37926"/> 

<specie  name="AA_W"  compartment="C1"  initialAmount="29498"/> 

<specie  name=''AA_Y"  compartment=''C1"  initialAmount="56588'7> 
</listOfSpecies> 

<listOfParameters> 
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<parameter  name="vol"  value=''5e-016'7> 

</listOfParameters> 

<listOfReactions> 

<reaction  name="r1"  reversible="false"> 

<listOfReactants> 

<specieReference  specie-'T7_RNAp7> 

<specieReference  specie=''T7_P7> 

</listOfReactants> 

<listOfProducts> 

<specieReference  specie="T7_RNAp_T7_P7> 

</listOfProducts> 

<kineticLaw  formula="c1  *T7_RNAp*T7_P/vol"> 

<listOfParameters> 

<parameter  name-'cl"  value="1e-0187> 

<parameter  name="vol"  value="5e-0167> 

</listOfParameters> 

</kineticLaw> 

</reaction> 

<reaction  name="r2"  reversible="false"> 

<listOfReactants> 

<specieReference  specie="T7_RNAp_T7_P7> 

</listOfReactants> 

<NstOfProducts> 

<specieReference  specie="T7_RNAp7> 

<specieReference  specie="T7_P7> 

</listOfProducts> 

<kineticLaw  formula-'c2*T7_RNAp_T7_P"> 

<listOfParameters> 

<parameter  name-'c2"  value="0.17> 

</listOfParameters> 

</kineticLaw> 

</reaction> 

<reaction  name="r3"  reversible="false"> 

<NstOfReactants> 

<specieReference  specie="T7_RNAp_T7_P7> 

<specieReference  specie="ATP"  stoichiometry="3817> 

<specieReference  specie="GTP"  stoichiometry="4297> 

<specieReference  specie="UTP"  stoichiometry="3697> 

<specieReference  specie="CTP"  stoichiometry="3777> 

</listOfReactants> 

<listOfProducts> 

<specieReference  specie='T7_RNAp7> 

<specieReference  specie='T7_P7> 

<specieReference  specie="geneA_mRNA7> 

<specieReference  specie-'PPi"  stoichiometry="15567> 

</listOfProducts> 

<kineticLaw 

formula="c3*(T7_RNAp_T7_P/vol)*(ATP/(ATP+km3_ATP))*(GTP/(GTP+km3_GTP))*(CTP/(CTP+km3_C 

TP))*(UTP/(UTP+km3_UTP))"> 

<listOfParameters> 

<parameter  name="c3"  value="5e-0177> 

<parameter  name="vol"  value="5e-0167> 

<parameter  name="km3_ATP"  value="30V7> 

<parameter  name="km3_GTP"  value="3017> 

<parameter  name="km3_CTP"  value="3017> 

<parameter  name="km3_UTP"  value="3017> 
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</listOfParameters> 

</kineticl_aw> 

</reaction> 

<reaction  name="r4"  reversible="false"> 

<listOfReactants> 

<specieReference  specie="geneA_mRNA'V> 

<specieReference  specie-'RNase'7> 

</listOfReactants> 

<listOfProducts> 

<specieReference  specie="RNase'7> 

<specieReference  specie="AMP"  stoichiometry=''38r7> 

<specieReference  specie=''GMP"  stoichiometry="429'7> 

<specieReference  specie="UMP"  stoichiometry="369'7> 

<specieReference  specie-'CMP"  stoichiometry="377'7> 

</listOfProducts> 

<kineticLaw 

formula="c4*(Prot/vol)*RNase/(RNase+km4_RNase*(1+(T7_RNAp/km4_T7_RNAp)+(Pro_A/km4_Pro_A) 

+(Prot/km4_Prot)))"> 

<listOfParameters> 

<parameter  name="c4"  value="5e-033'7> 

<parameter  name="km4_RNase"  value="301'7> 

<parameter  name="km4_T7_RNAp"  value-'301'7> 

<parameter  name="km4_Prot"  value="301'7> 

<parameter  name="km4_Pro_A"  value="301'7> 

<parameter  name="vol"  value="5e-016'7> 

</listOfParameters> 

</kineticLaw> 

</reaction> 

<reaction  name="r5"  reversible="false"> 

<listOfReactants> 

<specieReference  specie="geneA_mRNA'7> 

<specieReference  specie="Rib_s'7> 

</listOfReactants> 

<NstOfProducts> 

<specieReference  specie="Rib_s_geneA_mRNA'7> 

</listOfProducts> 

<kineticLawformula=''c5*Rib_s*geneA_mRNA/vol"> 

<listOfParameters> 

<parameter  name="c5"  value-'1e-018'7> 

<parameter  name="vol"  value="5e-016'7> 

</listOfParameters> 

</kineticl_aw> 

</reaction> 

<reaction  name="r6"  reversible="false"> 

<NstOfReactants> 

<specieReference  specie="Rib_s_geneA_mRNA'7> 

</listOfReactants> 

<NstOfProducts> 

<specieReference  specie="geneA_mRNA'7> 

<specieReference  specie="Rib_s'7> 

</listOfProducts> 

<kineticl_aw  formula="c6*Rib_s_geneA_mRNA"> 

<listOfParameters> 

<parameter  name-'c6"  value="0.1'7> 

</listOfParameters> 

</kineticLaw> 
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</reaction> 

<reaction  name="r7"  reversible="false"> 

<NstOfReactants> 

<specieReference  specie="Rib_s_geneA_mRNA"/> 

<specieReference  specie=''RibJ"/> 

<specieReference  specie-'GTP"  stoichiometry="1552'V> 

<specieReference  specie="ATP"  stoichiometry="517'7> 

<specieReference  specie=''AA_A"  stoichiometry="447> 

<specieReference  specie="AA_C"  stoichiometry-'9'7> 

<specieReference  specie=''AA_D"  stoichiometry="27'7> 

<specieReference  specie-' AA_E"  stoichiometry="43'7> 

<specieReference  specie="AA_F"  stoichiometry="22'7> 

<specieReference  specie=''AA_G"  stoichiometry="40'7> 

<specieReference  specie="AA_H"  stoichiometry="7"/> 

<specieReference  specie=''AA_r'  stoichiometry=''31'7> 

<specieReference  specie-' AA_K"  stoichiometry="23'7> 

<specieReference  specie=''AA_L"  stoichiometry="53"/> 

<specieReference  specie=''AA_M"  stoichiometry="13'7> 

<specieReference  specie="AA_N"  stoichiometry="18"/> 

<specieReference  specie=''AA_P"  stoichiometry="25'7> 

<specieReference  specie-' AA_Q"  stoichiometry="20'7> 

<specieReference  specie=''AA_R"  stoichiometry="34'7> 

<specieReference  specie=''AA_S"  stoichiometry="29"/> 

<specieReference  specie=''AA_T"  stoichiometry="29'7> 

<specieReference  specie=''AA_W"  stoichiometry="8"/> 

<specieReference  specie=''AA_V"  stoichiometry="21'7> 

<specieReference  specie=''AA_Y"  stoichiometry="21'7> 

</listOfReactants> 

<listOfProducts> 

<specieReference  specie=''geneA_mRNA'7> 

<specieReference  specie="Rib_s"/> 

<specieReference  specie-'RibJ"/> 

<specieReference  specie=''Pro_A'7> 

<specieReference  specie=''Pi"  stoichiometry="2069'7> 

<specieReference  specie=''GDP"  stoichiometry-'1552"/> 

<specieReference  specie="ADP"  stoichiometry=''517'7> 

</listOfProducts> 

<kineticl_aw 

formula="c7*(Rib_l/vol)*(ATP/(ATP+km7_ATP))*(GTP/(GTP+km7_GTP))*(Rib_s_geneA_mRNA/(Rib_s_g 

eneA_mRNA+km7_Rib_s_geneA_mRNA))*(AA_l/(AA_l+km7_AA_l))*(AA_M/(AA_M+km7_AA_M))*(AA_ 

T/(AA_T+km7_AA_T))*(AA_D/(AA_D+km7_AA_D))*(AA_P/(AA_P+km7_AA_P))*(AA_F/(AA_F+km7_AA_ 

F))*(AA_R/(AA_R+km7_AA_R))*(AA_G/(AA_G+km7_AA_G))*(AA_L/(AA_L+km7_AA_L))*(AA_W/(AA_W 

+km7_AA_W))*(AA_N/(AA_N+km7_AA_N))*(AA_K/(AA_K+km7_AA_K))*(AA_C/(AA_C+km7_AA_C))*(A 

A_H/(AA_H+km7_AA_H))*(AA_Q/(AA_Q+km7_AA_Q))*(AA_A/(AA_A+km7_AA_A))*(AA_E/(AA_E+km7_ 

AA_E))*(AA_S/(AA_S+km7_AA_S))*(AA_Y/(AA_Y+km7_AA_Y))*(AA_V/(AA_V+km7_AA_V))"> 

<listOfParameters> 

<parameter  name="c7"  value="2.5e-018'7> 

<parameter  name="vol"  value="5e-016"/> 

<parameter  name="km7_ATP"  value="301'7> 

<parameter  name=''km7_GTP"  value="301"/> 

<parameter  name=''km7_Rib_s_geneA_mRNA"  value="301'7> 

<parameter  name=''km7_AAJ''  value=''30r7> 

<parameter  name=''km7_AA_M"  value="301"/> 

<parameter  name=''km7_AA_T''  value=''301'7> 

<parameter  name=''km7_AA_D"  value="301"/> 

<parameter  name="km7_AA_P"  value="301"/> 
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<parameter  name="km7_AA_F"  value="301'V> 

<parameter  name="km7_AA_R"  value="301"/> 

<parameter  name="km7_AA_G"  value="301'V> 

<parameter  name="km7_AA_L"  value="301'7> 

<parameter  name="km7_AA_W"  value="30r7> 
parameter  name="km7_AA_N"  value="301'7> 

<parameter  name="km7_AA_K"  value="301'7> 

<parameter  name="km7_AA_C"  value="301'7> 

<parameter  name-'km7_AA_H"  value="301'7> 

<parameter  name="km7_AA_Q"  value="301'7> 

<parameter  name="km7_AA_A"  value="301'7> 

<parameter  name="km7_AA_E"  value="301'7> 

<parameter  name="km7_AA_S"  value="301'7> 

<parameter  name="km7_AA_Y"  value="301'7> 

<parameter  name="km7_AA_V"  value="301'7> 

</listOfParameters> 

</kineticLaw> 

</reaction> 

<reaction  name="r8"  reversible="false"> 

<listOfReactants> 

<specieReference  specie=''Pro_A'7> 

<specieReference  specie-' AA_E'7> 

<specieReference  specie="ATP'7> 

<specieReference  specie="AA_C'7> 

</listOfReactants> 

<NstOfProducts> 

<specieReference  specie="Pro_A'7> 

<specieReference  specie="ADP'7> 

<specieReference  specie="Pi'7> 

<specieReference  specie="GC'7> 

</listOfProducts> 

<kineticLaw 

formula="c8*(Pro_A/vol)*(AA_E/(AA_E+km8_AA_E))*(ATP/(ATP+km8_ATP))*(AA_C/(AA_C+km8_AA_C 

))"> 

<listOfParameters> 

<parameter  name="c8"  value="2.5e-016'7> 

<parameter  name="km8_ATP"  value="301"/> 

<parameter  name="km8_AA_E"  value="301"/> 

<parameter  name="km8_AA_C"  value="301'7> 

<parameter  name="vol"  value="5e-016"/> 

</listOfParameters> 

</kineticLaw> 

</reaction> 

<reaction  name="r9"  reversible="false"> 

<NstOfReactants> 

<specieReference  specie="Pro_A'7> 

<specieReference  specie="Prot'7> 

</listOfReactants> 

<NstOfProducts> 

<specieReference  specie="Prot7> 

<specieReference  specie="AA_A"  stoichiometry="44'7> 

<specieReference  specie="AA_C"  stoichiometry="9"/> 

<specieReference  specie=''AA_D"  stoichiometry="27'7> 

<specieReference  specie-' AA_E"  stoichiometry="43'7> 

<specieReference  specie=''AA_F"  stoichiometry="22'7> 

<specieReference  specie="AA_G"  stoichiometry="40'7> 
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<specieReference  specie-' AA_H"  stoichiometry="7"/> 

<specieReference  specie=''AA_r'  stoichiometry="31'7> 

<specieReference  specie="AA_K"  stoichiometry="23'7> 

<specieReference  specie-' AA_L"  stoichiometry="53'7> 

<specieReference  specie=''AA_M"  stoichiometry="13'7> 

<specieReference  specie="AA_N"  stoichiometry="18'7> 

<specieReference  specie=''AA_P"  stoichiometry="25'7> 

<specieReference  specie="AA_Q"  stoichiometry="20'7> 

<specieReference  specie-' AA_R"  stoichiometry="34'7> 

<specieReference  specie=''AA_S"  stoichiometry="29"/> 

<specieReference  specie=''AA_T”  stoichiometry="29'7> 

<specieReference  specie=''AA_W"  stoichiometry="8"/> 

<specieReference  specie=''AA_V"  stoichiometry="21'7> 

<specieReference  specie-' AA_Y"  stoichiometry="21'7> 

</listOfProducts> 

<kineticLaw 

formula="c9*(Prot/vol)*Pro_A/(Pro_A+km9_Pro_A*(1+(RNase/km9_RNase)+(T7_RNAp/km9_T7_RNAp) 

+(Prot/km9_Prot)))"> 

<listOfParameters> 

<parameter  name=''c9"  value="5e-030'7> 

<parameter  name="km9_Pro_A"  value="301"/> 

<parameter  name="km9_RNase"  value="301"/> 

<parameter  name="km9_T7_RNAp"  value="301'7> 

<parameter  name="km9_Prot"  value="301'7> 

<parameter  name="vol"  value="5e-016"/> 

</listOfParameters> 

</kineticLaw> 

</reaction> 

<reaction  name=''r10"  reversible-'false"> 

<listOfReactants> 

<specieReference  specie=''RNase'7> 

<specieReference  specie-'Prot'7> 

</listOfReactants> 

<listOfProducts> 

<specieReference  specie=''Prot'7> 

<specieReference  specie=''AA_A"  stoichiometry="50"/> 

<specieReference  specie=''AA_C"  stoichiometry="50'7> 

<specieReference  specie=''AA_D"  stoichiometry="50'7> 

<specieReference  specie-' AA_E"  stoichiometry="50'7> 

<specieReference  specie=''AA_F"  stoichiometry="50'7> 

<specieReference  specie=''AA_G"  stoichiometry="50'7> 

<specieReference  specie=''AA_H"  stoichiometry="50'7> 

<specieReference  specie=''AA_l"  stoichiometry="50'7> 

<specieReference  specie-' AA_K"  stoichiometry=''50"/> 

<specieReference  specie=''AA_L"  stoichiometry="50'7> 

<specieReference  specie=''AA_M"  stoichiometry="50'7> 

<specieReference  specie=''AA_N"  stoichiometry="50'7> 

<specieReference  specie=''AA_P"  stoichiometry=''50"/> 

<specieReference  specie-' AA_Q"  stoichiometry="50'7> 

<specieReference  specie=''AA_R"  stoichiometry="50'7> 

<specieReference  specie=''AA_S"  stoichiometry="50'7> 

<specieReference  specie=''AA_T"  stoichiometry="50'7> 

<specieReference  specie="AA_W"  stoichiometry="50'7> 

<specieReference  specie-' AA_V"  stoichiometry=''50"/> 

<specieReference  specie=''AA_Y"  stoichiometry="50'7> 

</listOfProducts> 
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<kineticl_aw 

formulat’d  0*(Prot/vol)*RNase/(RNase+km  1 0_RNase*(1  +(T7_RN  Ap/km  1 0_T7_RN  Ap)+(Pro_A/km  1 0_Pr 
o_A)+(Prot/km  1 0_Prot)))"> 

<listOfParameters> 

<parameter  name=”c10"  value=”5e-0307> 

<parameter  name="km10_RNase"  value="301"/> 

<parameter  name=”km10_T7_RNAp”  value="301'V> 

<parameter  name=”km10_Prot"  value=”3017> 

<parameter  name=”km10_Pro_A"  value="301'V> 

<parameter  name="vol"  value="5e-016"/> 

</listOfParameters> 

</kineticLaw> 

</reaction> 

<reaction  name="r11"  reversible="false"> 

<NstOfReactants> 

<specieReference  specie=''T7_RNAp7> 

<specieReference  specie=''Prot"/> 

</listOfReactants> 

<listOfProducts> 

<specieReference  specie="Prot7> 

<specieReference  specie="AA_A"  stoichiometry="507> 

<specieReference  specie="AA_C"  stoichiometry="50'7> 

<specieReference  specie=”AA_D"  stoichiometry="50'7> 

<specieReference  specie=”AA_E"  stoichiometry="50'7> 

<specieReference  specie="AA_F"  stoichiometry="50'7> 

<specieReference  specie=”AA_G"  stoichiometry="50'7> 

<specieReference  specie=”AA_H"  stoichiometry="50'7> 

<specieReference  specie=”AA_r  stoichiometry=''50'7> 

<specieReference  specie=''AA_K"  stoichiometry="507> 

<specieReference  specie="AA_L"  stoichiometry="50'7> 

<specieReference  specie=''AA_M"  stoichiometry="507> 

<specieReference  specie="AA_N"  stoichiometry="50'7> 

<specieReference  specie=''AA_P"  stoichiometry="507> 

<specieReference  specie="AA_Q"  stoichiometry="50'7> 

<specieReference  specie=”AA_R"  stoichiometry="50'7> 

<specieReference  specie=''AA_S"  stoichiometry="507> 

<specieReference  specie="AA_T"  stoichiometry="50'7> 

<specieReference  specie=”AA_W"  stoichiometry="50'7> 

<specieReference  specie=''AA_V"  stoichiometry="507> 

<specieReference  specie=''AA_Y"  stoichiometry="507> 

</listOfProducts> 

<kineticLaw 

formula=”c1 1*(Prot/vol)*T7_RNAp/(T7_RNAp+km1 1_T7_RNAp*(1+(Pro_A/km1 1_Pro_A)+(RNase/km1 1_ 
RNase)+(Prot/km1 1_Prot)))”> 

<listOfParameters> 

<parameter  name=”c1 1"  value="5e-030"/> 

<parameter  name="km1 1_RNase”  value=”301'7> 

<parameter  name="km1 1_Prot"  value=''3017> 

<parameter  name="km1 1_Pro_A"  value="301'7> 

<parameter  name="km1 1_T7_RNAp"  value="301'7> 

<parameter  name="vor'  value="5e-016"/> 

</listOfParameters> 

</kineticLaw> 

</reaction> 

<reaction  name="r12"  reversible="false"> 

<NstOfReactants> 
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<specieReference  specie="Prot'7> 

<specieReference  specie="Prot'7> 

</listOfReactants> 

<listOfProducts> 

<specieReference  specie="Prot7> 

<specieReference  specie="AA_A"  stoichiometry="50'7> 

<specieReference  specie="AA_C"  stoichiometry="50'7> 

<specieReference  specie-' AA_D"  stoichiometry="50'7> 

<specieReference  specie=''AA_E"  stoichiometry="50'7> 

<specieReference  specie="AA_F"  stoichiometry="50'7> 

<specieReference  specie=''AA_G"  stoichiometry="50'7> 

<specieReference  specie=''AA_H"  stoichiometry="50'7> 

<specieReference  specie-' AAJ"  stoichiometry="50'7> 

<specieReference  specie="AA_K"  stoichiometry="50"/> 

<specieReference  specie=''AA_L"  stoichiometry="50'7> 

<specieReference  specie=''AA_M''  stoichiometry-'50"/> 

<specieReference  specie=''AA_N"  stoichiometry=''50'7> 

<specieReference  specie-' AA_P"  stoichiometry="50'7> 

<specieReference  specie=''AA_Q"  stoichiometry="50'7> 

<specieReference  specie=''AA_R"  stoichiometry="50'7> 

<specieReference  specie=''AA_S"  stoichiometry="50'7> 

<specieReference  specie=''AA_T"  stoichiometry="50'7> 

<specieReference  specie=''AA_W"  stoichiometry="50'7> 

<specieReference  specie=''AA_V"  stoichiometry="50"/> 

<specieReference  specie=''AA_Y"  stoichiometry="50'7> 

</listOfProducts> 

<kineticLaw 

formula="c12*(Prot/vol)*Prot/(Prot+km12_Prot*(1+(T7_RNAp/km12_T7_RNAp)+(Pro_A/km12_Pro_A)+(R 
Nase/km  1 2_RNase)))”> 

<listOfParameters> 

<parameter  name=''c12"  value="5e-030"/> 

<parameter  name-'km12_RNase''  value-'301"/> 

<parameter  name="km12_T7_RNAp"  value="301'7> 

<parameter  name="km12_Prot"  value="301"/> 

<parameter  name="km12_Pro_A"  value="301'7> 

<parameter  name="vol"  value="5e-016'7> 

</listOfParameters> 

</kineticLaw> 

</reaction> 

</listOfReactions> 

</model> 

</sbml> 
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Model  Parameters 
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Table  C.2:  Numerical  values  for  model  reaction  parameters  for  the  discrete-stochastic  (D-S)  and 
the  continuous-deterministic  (C-D)  models. 


Reaction  Parameters 

Parameter 

D-S  Model 

C-D  Model 

rl  -  association  of  T7  RNA  polymerase 
(T7_RNAp)  with  T7  promoter  (T7_P)  on  geneA 
gene 

kl 

1.0e-18 

6.02e-01 

r2  -  dissociation  of  T7  RNA  polymerase-T7 
promoter  complex 

k2 

1.0e-01 

1.0e-01 

r3  -  transcription  of  the  geneA  gene  to  form 
geneA_mRNA  (lumped  reaction) 

k3 

5e-017 

1.0e-01 

km3  ATP 

301 

1 

km3  GTP 

301 

1 

km3  CTP 

301 

1 

km3 UTP 

301 

1 

r4  -  degradation  of  geneA  mRNA  by  RNase 
(lumped  reaction) 

k4 

5.0e-025 

1.0e-09 

km4geneAmRNA 

301 

1 

r5  -  association  of  the  small  ribosomal  unit 
(Rib  s)  with  geneA  mRNA  ribosomal  binding 
site 

k5 

le-18 

6.02e-01 

r6  -  dissociation  of  Rib_s-geneA_mRNA 
complex 

k6 

le-01 

le-01 

r7  -  translation  of  geneA  mRNA  to  form  the 
Pro_A  protein  (lumped  reaction) 

k7 

2.5e-18 

5e-03 

km7  geneA  mRNA 

301 

1 

km7  ATP 

301 

1 

km7  GTP 

301 

1 

km7  AA  I 

301 

1 

km7_AA_M 

301 

1 

km7  AA  T 

301 

1 

km7  AA  D 

301 

1 

km7  AA  P 

301 

1 

km7  AA  F 

301 

1 

km7_AA_R 

301 

1 

km7  AA  G 

301 

1 

km7  AA  L 

301 

1 

km7  AA  W 

301 

1 

km7_AA_N 

301 

1 

km7  AA  K 

301 

1 

km7  AA  C 

301 

1 

km7  AA  H 

301 

1 

km7  AA  Q 

301 

1 

km7_AA_A 

301 

1 

km7  AA  E 

301 

1 

km7_AA_S 

301 

1 

km7  AA  Y 

301 

1 
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km7_AA_V 

301 

1 

r8  -  production  of  ProdA  using  Pro_A  (lumped 
reaction) 

k8 

7.0e-017 

1.4e-01 

km8  ATP 

301 

1 

km8  AA  C 

301 

1 

km8  AA  E 

301 

1 

r9  -  degradation  of  Pro_A  by  the  generic 
protease  (lumped  reaction) 

k9 

5.0e-025 

1.0e-09 

km9  T7  RNAp 

301 

1 

km9  Pro  A 

301 

1 

km9_RNase 

301 

1 

km9 Prot 

301 

1 

rlO  -  degradation  of  T7_RNAp  by  the  generic 
protease  (lumped  reaction) 

klO 

5.0e-025 

1.0e-09 

kmlO  T7  RNAp 

301 

1 

kmlO  Pro  A 

301 

1 

kmlORNase 

301 

1 

kmlOProt 

301 

1 

rl  1  -  degradation  of  RNase  by  the  generic 
protease  (lumped  reaction) 

kll 

5.0e-025 

1.0e-09 

kml  1  T7  RNAp 

301 

1 

kmll  Pro  A 

301 

1 

kml  1  RNase 

301 

1 

kml  IProt 

301 

1 

rl2  -  self-degradation  of  the  generic  protease 

Prot  (lumped  reaction) 

kl2 

5.0e-025 

1.0e-09 

kml 2  T7  RNAp 

301 

1 

kml 2  Pro  A 

301 

1 

kml2_RNase 

301 

1 

kml2_Prot 

301 

1 
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Table  C.3:  Initial  conditions  for  the  state  variables  in  geneA  CFTT  OpO  model. 


Initial  Number  of  Molecules  in  State  Variables 

D-S 

(#  of  molecules)* 

C-D 

(PM) 

T7_P 

1 

0.00332 

T7_RNAp 

301 

1 

T7_RNAp-T7_P 

0 

0 

ATP 

189329 

629 

GTP 

234178 

778 

CTP 

150500 

500 

UTP 

166754 

554 

geneA_mRNA 

0 

0 

PPi 

0 

0 

AMP 

0 

0 

UMP 

0 

0 

CMP 

6622 

22 

GMP 

0 

0 

Ribs 

301 

1 

Rib_s-geneA_mRNA 

0 

0 

Rib_l 

301 

1 

AA_I  -  isoleucine 

52976 

176 

AA_M  -  methionine 

37324 

124 

AA_T  -  threonine 

8428 

28 

AA_D  -  aspartic  acid 

61705 

205 

AA  P  -  proline 

205884 

684 

AA  F  -  phenylalanine 

45752 

152 

AA  R  -  arginine 

49665 

165 

AA_G  -  glycine 

76454 

254 

AA_L  -  leucine 

44247 

147 

AA_W  -  tryptophan 

29498 

98 

AA_N  -  asparagine 

12943 

43 

AA_K  -  lysine 

21973 

73 

AA_C  -  cysteine 

94213 

313 

AA_H  -  histidine 

33411 

111 

AA_Q  -  glutamine 

2307 

7 
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AA  A  -  alanine 

43645 

145 

AA_E  -  glutamic  acid 

79163 

263 

AA  S  -  serine 

35217 

117 

AA  Y  -  tyrosine 

56588 

188 

AA  V  -  valine 

37926 

126 

Pro_A 

0 

0 

GDP 

15652 

52 

Pi 

0 

0 

ADP 

29197 

97 

Sub_A 

30100 

100 

Sub_B 

45002 

150 

ProdA 

0 

0 

RNase 

301 

1 

Prot 

301 

1 

*  Initial  number  of  molecules  of  nucleotides  and  amino  acids  are  based  on  analytical 
measurements  in  a  commercial  gene  expression  kit. 
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